Exporting tagged articles from Google Reader

For me, Google Reader was never just a way to plow through a large number of feeds; it was also a database of important articles that I lovingly annotated with tags and notes while those were still available. Apparently I was in the minority, judging by Reader clients and, more importantly, by Reader's own export tools, which let you take away most of your stuff but not the tagged items themselves.

Hence I wrote a quick and dirty Python script that lets you do just that using libgreader. You can find it on GitHub, and it has a few other features, such as exporting all articles from a feed. Who knows how long Feedburner will be around, so the next step will be resolving Feedburner links to the original article URLs.
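As an illustration of that next step, here is a minimal sketch of how Feedburner links might be resolved. It is not taken from the script itself. The host list, the function names, and the use of a HEAD request are all assumptions made for this example, and some servers may reject HEAD requests.

```python
from urllib.parse import urlparse
from urllib.request import Request, urlopen

# Assumption: hosts that commonly served Feedburner redirect links.
FEEDBURNER_HOSTS = ("feedproxy.google.com", "feeds.feedburner.com")


def is_feedburner_link(url):
    """Return True if the URL points at one of the assumed Feedburner hosts."""
    host = urlparse(url).hostname or ""
    return host in FEEDBURNER_HOSTS


def resolve_redirect(url, timeout=10.0):
    """Follow HTTP redirects and return the final URL (requires network access)."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.geturl()


def resolve_feedburner_links(urls, resolver=resolve_redirect):
    """Map each URL to its resolved target.

    Non-Feedburner URLs are kept unchanged; so are links whose
    resolution fails, so nothing is lost from the backup.
    """
    resolved = {}
    for url in urls:
        if not is_feedburner_link(url):
            resolved[url] = url
            continue
        try:
            resolved[url] = resolver(url)
        except OSError:  # URLError and HTTPError are subclasses of OSError
            resolved[url] = url
    return resolved
```

The `resolver` argument is injectable so the mapping logic can be tried without network access, for example by passing a function that returns a fixed URL.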

My current backup amounts to almost 500MB, so the script is obviously useful to me. Hopefully it is useful to others too. If you find bugs or data that is not exported but should be, please do let me know.

I am also looking for a good alternative that is not a hosted-only service, supports archiving, and can process most feeds. Currently I am leaning toward modifying NewsBlur to support tagging and running my own instance, but I would definitely prefer to avoid this work if possible.

  1. I switched to Tiny Tiny RSS, which was an easy install, and the source is on GitHub. The Android app is quite nice too.
    It turned out to be so nice that I should have switched earlier.

    Comment by Christof #
  2. I have been using Tiny Tiny RSS for years now. Give it a try, it fits your “wants” quite well.

    Comment by Thomas — #
  3. @Christof and @Thomas: Thanks, I’ll give it a go. I passed on it at first because I thought it was very finicky about parsing feeds.

    @Brandon: I know, but one thing missing from the list of exported data is tagged items, which was the main reason I developed this script. It also simplifies downloading an archive of blog posts for websites that don’t exist anymore.

    Comment by markos #
  4. I’m having the exact same problem and this seems like a great solution. I have so many research articles categorized in tags. How are you going to use the tag data once it is exported? Have you found an RSS replacement yet?

    Comment by Matthew Parry — #
  5. I don’t know yet, but I plan to import the articles into my next system. I am not sure what my replacement will be (for now), but I will likely go with Tiny Tiny RSS, which already has tags, or NewsBlur, which doesn’t but is easier for me to extend. I’d go with a hosted service only if it also has a self-hosted option (like NewsBlur).

    I don’t plan to use either of them for very long. While they seem to be mature, good products, neither of them really fits my needs all that well. What I want is more than a feed reader with an archive. I need a reader into which I can bring other sources of information, where I can annotate articles better, where looking up information is easier, and which has better sharing capabilities (I miss feeds of tagged items). I also need it to work across devices and on low-bandwidth networks.

    So after years of deliberation I plan to start working on my own open source reader this summer. I’m not sure yet how quickly I’ll be able to have something usable, but a full application will obviously take some time.

    Comment by markos #
  6. I use reader the exact same way and have been for over 6yrs. My personal curated database of knowledge. I also want to be able to share an entire feed of my tagged articles. I really miss the ability to add notes that was taken away by Google recently as well as the note in reader bookmarklet that allowed me to bring random web clippings into reader then tag and notate.

    Comment by Brian — #
  7. Hello MarKos
    I’m a power user of google reader and use it for years. for weeks i’m searching for scripts to backup tagged articles. I have many of them! and here it is,u have it!u r the best markos and very appreciate for ur work.

    but just one problem, dont know how to use ur script. I downloaded g reader hoover master and libgreader 0.6.3.

    what’s next? i’m not familiar with these stuff :( could u plz explain me step by step?
    thanx for ur time and ur script

    Comment by Amir Emami — #
  8. I also installed Python. I can provide team viewer too.
    plz help me Markos.

    Comment by Amir Emami — #
  9. Thanks, Amir. I realize my instructions are difficult to follow for someone who is not a Python developer (and maybe even for them).

    I have sent you an email. I will also update instructions on github. Soon, I hope.

    Comment by markos #
  10. God bless u Markos, thank u for spending ur time on that :)

    Comment by Amir Emami — #
  11. Hey Markos, thanks for this script..keep wondering why there is nothing else that would extract the tags from reader but luckily we have you. Same problem as Amir though. Can you forward the mail you sent him :D

    Comment by NomenNescia — #
  12. Thank you! It’s so annoying that Google didn’t think of the people like us who actually used the features they put in Reader.

    I’m in the same boat as Amir and Nomen – I’m not so tech-savvy as to know exactly how to use the script. Instructions would be helpful!

    Comment by Tia — #
  13. Instructions are definitely lacking, and together with Amir we found a few issues with installing the script on Windows.

    I am finally back from my business trip so I expect to have some time this weekend to resolve these issues and provide an easier and better documented way of installing and running my script.

    Thanks for the patience.

    Comment by markos #
  14. Thank you, thank you for sharing this information. I have been walking around in a fog since the news about GReader’s impending decommission. With the hopes that you’ll be able to work out any sticking points with a Windows install, I look forward to learning more. I would love to get the instruction by email or however you decide to post it. Bless you for sharing! :)

    Comment by AnnM — #
  15. Hello guyz
    the best way to back up tagged articles till now:
    1- make tag public in greader
    2- use this link
    http://www.google.com/reader/public/atom/user%2F01597317633809385927%2Flabel%2FArt?n=1000 (replace 01597317633809385927 with ur greader id and Art with ur tag name)
    3- download newzcrawler
    4- add the link to channel newsfeed and turn automatic update off
    this is the best way i figured out.

    Comment by Amir Emami — #
  16. Marko,

    Thank you for a great python script. Our tagged articles are saved in JSON. Now what … have you been able to upload the tagged articles into any other RSS reader?

    Thanks, Clay

    Comment by clay — #
  17. No, I did not get around to it yet. I have since switched to Tiny Tiny RSS, which seems to support importing articles, but I still need to write a script that will translate my JSON to tt-rss. I started work on resolving Feedburner links in case that also disappears.

    Which reader would you like to see supported?

    I’ve also been thinking about adding an HTML export, which could make the export more useful to those who won’t or can’t import articles elsewhere.

    Comment by markos #
  18. Which reader would I like to see supported? A reader that tags articles like Google Reader … does. not. exist.

    I envision a bifurcated solution: articleReader for reading and articleTagger for tagging.

    articleReader: like newsblur, feedly, ttr …
    articleTagger: something like pinboard for tagged articles …

    Until then, will be importing json tagged articles into firefox/chrome bookmarks.

    Comment by clay #
  19. My own personal solution has now evolved to be Tiny Tiny RSS for the frontend and gmail for tagging and archival storage. If I like an article I click the share button and send it to a gmail account for review at a later point. Gmail supports tagging and also will probably not disappear anytime soon.

    Comment by Knarf — #
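The public-feed URL described in comment 15 can also be built programmatically. The following is a minimal sketch, assuming the URL pattern quoted in that comment; the function name and its parameters are hypothetical and introduced only for illustration. The tag has to be made public in Reader first, as that comment notes.

```python
from urllib.parse import quote, urlencode

# URL prefix as quoted in comment 15.
PUBLIC_ATOM_BASE = "http://www.google.com/reader/public/atom/"


def public_tag_feed_url(user_id, tag, count=1000):
    """Build the public Atom feed URL for one tag of one Reader user.

    The stream id "user/<id>/label/<tag>" is percent-encoded as a single
    path segment, so its slashes become %2F as in the URL from the comment.
    """
    stream_id = "user/{}/label/{}".format(user_id, tag)
    return PUBLIC_ATOM_BASE + quote(stream_id, safe="") + "?" + urlencode({"n": count})


if __name__ == "__main__":
    # Reproduces the example URL from comment 15.
    print(public_tag_feed_url("01597317633809385927", "Art"))
```

Passing `safe=""` to `quote` matters here: by default `quote` leaves `/` unescaped, which would produce a different path than the one in the comment.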

