Like Microsoft Academic Graph, OpenAlex also includes article abstracts as inverted indexes:

Object: The abstract of the work, as an inverted index, which encodes information about the abstract's words and their positions within the text. Like Microsoft Academic Graph, OpenAlex doesn't include plaintext abstracts due to legal constraints.

Since it's pretty easy to undo this inverted index and recover the original text, how does this solve the copyright problem?

It's like distributing a copyrighted book after substituting each letter with the next one in the alphabet.
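To make the "pretty easy to undo" claim concrete, here is a minimal sketch of the reversal. It assumes the inverted index is a mapping from each word to the list of positions where that word occurs. The function name `rebuild_abstract` and the sample data are hypothetical and used only for illustration.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Recover plain text from a word -> positions mapping (assumed format)."""
    # Pair every position with the word that occurs there.
    positioned = [
        (position, word)
        for word, positions in inverted_index.items()
        for position in positions
    ]
    # Sorting by position restores the original word order.
    return " ".join(word for _, word in sorted(positioned))


# Hypothetical sample, not taken from any real record.
sample = {"to": [0, 4], "be": [1, 5], "or": [2], "not": [3]}
print(rebuild_abstract(sample))  # prints "to be or not to be"
```

The reconstruction needs only a flatten and a sort, so the encoding loses nothing beyond the original whitespace.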

  • 1
    I wonder what legal constraints they are talking about, since for most academic journals the abstracts of the articles are freely available. They show you the abstract (along with title and authors) for free precisely to convince you to purchase the rest of the article. Blocking access to the abstract makes no sense to me.
    – quarague
    Commented 2 days ago
  • 2
    Maybe I'm missing something on the technical side, but it seems like there's an obvious answer: an index of a work is not a copy of that work, therefore making an index is not copyright infringement.
    – bdb484
    Commented yesterday
  • @bdb484 An inverted index is the set of scrambled words that compose the work, together with instructions to put the words back together in the correct order.
    Commented yesterday
  • 2
    I think you've misunderstood. An inverted index is structured, not scrambled, and while it may identify the positions of each word in the original work, that is not the same as "instructions to put the words back together in the correct order." By either definition, though, it's still not a copy, so it's not a copyright violation.
    – bdb484
    Commented yesterday
  • @bdb484 I literally linked you the code to get back the abstract from its inverted index.
    Commented yesterday
