I took Zach Fine's updates and added several of my own: bringing the script up to date with changes in plistlib and the Pinboard lib, making the 'all' command work, and making it behave better from the command line as a cron job or shell script, with better output and error handling.
My version is in a Gist here: https://gist.github.com/samuelkordik/6124c6d4d0d5c090594ae17531e733e8
The script wouldn't run for me, likely due to problems with newer libraries and Python 3.7. I played whack-a-mole, fixing it item by item with no knowledge of Python other than what I could Google. So my changes are messy, and I've removed a test related to dates. But I'm offering it below in case it's of use to anyone.
I also added one feature -- the items in pinboard will now have the same "date added" value as the original reading list item, rather than all inheriting the date they were synced to pinboard.
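For anyone adapting this, the date handling comes down to marking the plist's naive `DateAdded` value as UTC before handing it to Pinboard. The script does this with `pytz.utc.localize`; below is a minimal standard-library sketch of the same idea. The helper name `as_utc` is hypothetical and not part of the script.

```python
from datetime import datetime, timezone


def as_utc(date_added):
    """Return date_added as a timezone-aware UTC datetime.

    Hypothetical helper: a naive value (as read from the plist) is assumed
    to already be in UTC, so only the tzinfo is attached and the clock time
    is left unchanged. An aware value is converted to UTC.
    """
    if date_added.tzinfo is None:
        return date_added.replace(tzinfo=timezone.utc)
    return date_added.astimezone(timezone.utc)
```

The result could then be passed as the `dt` argument, the way the script passes `post_time` to `posts.add`.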
To run the script, note that it takes arguments of 'pb', 'md', or 'all' to export to pinboard, markdown file, or both.
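As a rough sketch of how those arguments could be turned into a list of exports, with 'all' expanded into both types and a default used when nothing valid is given: the names `parse_export_types` and `VALID_TYPES` are hypothetical, and this is one possible approach rather than what the script itself does.

```python
VALID_TYPES = ("pb", "md")  # Pinboard and Markdown exports


def parse_export_types(argv, default="md"):
    """Return the ordered, de-duplicated exports requested in argv.

    'all' expands to every valid type; unknown arguments are ignored.
    If nothing valid was requested, fall back to the default type.
    """
    selected = []
    for arg in argv:
        wanted = VALID_TYPES if arg == "all" else (arg,)
        for export_type in wanted:
            if export_type in VALID_TYPES and export_type not in selected:
                selected.append(export_type)
    return selected or [default]
```

Calling it as `parse_export_types(sys.argv[1:])` would skip the script name, so a script that happened to be named `md` or `pb` could not be mistaken for an argument.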
Thanks Brett for posting the original script!
#!/usr/bin/python
# ReadingListCatcher
# - A script for exporting Safari Reading List items to Markdown and Pinboard
#   Brett Terpstra 2015
#   Modified (clumsily) by Zach Fine 2020
# Uses code from <https://gist.github.com/robmathers/5995026>
# Requires Python pinboard lib for Pinboard.in import:
#     `easy_install pinboard` or `pip install pinboard`
import plistlib
from shutil import copy
import subprocess
import os
from tempfile import gettempdir
import sys
import atexit
import re
import time
from time import mktime
from datetime import date, datetime, timedelta
from os import path
import pytz

DEFAULT_EXPORT_TYPE = 'md' # pb, md or all
PINBOARD_API_KEY = 'username:API_KEY' # https://pinboard.in/settings/password
BOOKMARKS_MARKDOWN_FILE = '~/Dropbox/Reading List Bookmarks.markdown' # Markdown file if using md export
BOOKMARKS_PLIST = '~/Library/Safari/Bookmarks.plist' # Shouldn't need to modify

bookmarksFile = os.path.expanduser(BOOKMARKS_PLIST)
markdownFile = os.path.expanduser(BOOKMARKS_MARKDOWN_FILE)

# Make a copy of the bookmarks and convert it from a binary plist to text
tempDirectory = gettempdir()
sys.stdout.write('tempDirectory is ' + tempDirectory + '\n')
copy(bookmarksFile, tempDirectory)
bookmarksFileCopy = os.path.join(tempDirectory, os.path.basename(bookmarksFile))

def removeTempFile():
    os.remove(bookmarksFileCopy)

#atexit.register(removeTempFile) # Delete the temp file when the script finishes

class _readingList():
    def __init__(self, exportType):
        sys.stdout.write('running readinglist \n')
        self.postedCount = 0
        self.exportType = exportType

        if self.exportType == 'pb':
            sys.stdout.write('self.exportType=' + self.exportType + '\n')
            import pinboard
            self.pb = pinboard.Pinboard(PINBOARD_API_KEY)

        converted = subprocess.call(['plutil', '-convert', 'xml1', bookmarksFileCopy])

        if converted != 0:
            print('Couldn\'t convert bookmarks plist from xml format')
            sys.exit(converted)

        with open(bookmarksFileCopy,'rb') as fp:
            plist=plistlib.load(fp)
        # this method of opening the plist no longer works, gotta use plistlib.load (see above)
        # plist = plistlib.readPlist(bookmarksFileCopy)

        # There should only be one Reading List item, so take the first one
        readingList = [item for item in plist['Children'] if 'Title' in item and item['Title'] == 'com.apple.ReadingList'][0]

        if self.exportType == 'pb':
            lastRLBookmark = self.pb.posts.recent(tag='.readinglist', count=1)
            # last = lastRLBookmark['date']
            # this test seems to make no items get synced, so I'm bypassing it as I plan to clear my reading list completely after sending to pinboard:
            last = time.strptime("2013-01-01 00:00:00 UTC", '%Y-%m-%d %H:%M:%S UTC')
        else:
            self.content = ''
            self.newcontent = ''
            # last = time.strptime((datetime.now() - timedelta(days = 1)).strftime('%c'))
            last = time.strptime("2013-01-01 00:00:00 UTC", '%Y-%m-%d %H:%M:%S UTC')

            if not os.path.exists(markdownFile):
                open(markdownFile, 'a').close()
            else:
                with open (markdownFile, 'r') as mdInput:
                    self.content = mdInput.read()
                    matchLast = re.search(re.compile('(?m)^Updated: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} UTC)'), self.content)
                    if matchLast != None:
                        last = time.strptime(matchLast.group(1), '%Y-%m-%d %H:%M:%S UTC')

            # Left disabled: datetime.strptime() expects a string and a format, so this
            # call raises TypeError, and the loop below needs `last` as a struct_time.
            # last = datetime.strptime(*last[:6])

            rx = re.compile("(?m)^Updated: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC")
            self.content = re.sub(rx,'',self.content).strip()

        if 'Children' in readingList:
            cleanRx = re.compile("[\|\`\:_\*\n]")
            for item in readingList['Children']:
                last_dt = datetime.fromtimestamp(mktime(last))
                if item['ReadingList']['DateAdded'] > last_dt:
                    addtime = pytz.utc.localize(item['ReadingList']['DateAdded']).strftime('%c')
                    titletemp = item['URIDictionary']['title']
                    # title = re.sub(cleanRx, ' ', item['URIDictionary']['title'].encode('utf8'))
                    title = re.sub(cleanRx, ' ', titletemp)
                    title = re.sub(' +', ' ', title)
                    title = title.encode('utf8') # moved encode to the end of processing
                    url = item['URLString']
                    description = ''

                    if 'PreviewText' in item['ReadingList']:
                        description = item['ReadingList']['PreviewText']
                        # description = item['ReadingList']['PreviewText'].encode('utf8')
                        description = re.sub(cleanRx, ' ', description)
                        description = re.sub(' +', ' ', description)
                        description = description.encode('utf8') #moved the encode to the end of processing

                    if self.exportType == 'md':
                        self.itemToMarkdown(addtime, title.decode(), url, description.decode())
                    else:
                        if not title.strip(): #Z need to handle the case of no title as pinboard requires one
                            title='no title'
                            title=title.encode('utf8')
                        post_time=pytz.utc.localize(item['ReadingList']['DateAdded'])
                        self.itemToPinboard(post_time, title.decode(), url, description.decode())
                else:
                    break

        pluralized = 'bookmarks' if self.postedCount > 1 else 'bookmark'
        if self.exportType == 'pb':
            if self.postedCount > 0:
                sys.stdout.write('Added ' + str(self.postedCount) + ' new ' + pluralized + ' to Pinboard')
            else:
                sys.stdout.write('No new bookmarks found in Reading List')
        else:
            mdHandle = open(markdownFile, 'w')
            mdHandle.write('Updated: ' + datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S') + " UTC\n\n")
            mdHandle.write(self.newcontent + self.content)
            mdHandle.close()
            if self.postedCount > 0:
                sys.stdout.write('Added ' + str(self.postedCount) + ' new ' + pluralized + ' to ' + markdownFile)
            else:
                sys.stdout.write('No new bookmarks found in Reading List')

        sys.stdout.write("\n")

    def itemToMarkdown(self, addtime, title, url, description):
        sys.stdout.write('running itemToMarkdown \n')
        self.newcontent += '- [' + title + '](' + url + ' "Added on ' + addtime + '")'
        if not description == '':
            self.newcontent += "\n\n    > " + description
        self.newcontent += "\n\n"
        self.postedCount += 1

    def itemToPinboard(self, post_time, title, url, description):
        sys.stdout.write('running itemToPinboard \n')
        suggestions = self.pb.posts.suggest(url=url)
        tags = suggestions[0]['popular']
        tags.append('.readinglist')
        # sys.stdout.write('post_time = ' + post_time + '\n')
        self.pb.posts.add(url=url, dt=post_time, description=title, \
                extended=description, tags=tags, shared=False, \
                toread=True)
        print(title)
        print('\n')
        self.postedCount += 1

if __name__ == "__main__":
    exportTypes = []
    for arg in sys.argv:
        if re.match("^(md|pb|all)$",arg) and exportTypes.count(arg) == 0:
            exportTypes.append(arg)
    # len(sys.argv) is never 0, so check the result instead to apply the default
    if not exportTypes:
        exportTypes.append(DEFAULT_EXPORT_TYPE)

    for eType in exportTypes:
        _readingList(eType)
    sys.stdout.write('running\n')
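A side note on the plist handling: since Python 3.4, `plistlib.load` detects binary plists on its own, so the `plutil` conversion step may not be needed on newer Pythons. Here is a minimal sketch of reading the Reading List folder that way. The function name `find_reading_list` is hypothetical, and it assumes the same layout the script relies on (a top-level 'Children' list containing a folder titled 'com.apple.ReadingList').

```python
import plistlib


def find_reading_list(plist_path):
    """Return the Reading List entries from a Safari bookmarks plist.

    Hypothetical helper. Returns an empty list when the folder is missing
    or has no entries, instead of raising IndexError or KeyError.
    """
    with open(plist_path, "rb") as fp:
        plist = plistlib.load(fp)  # handles both XML and binary plists
    for folder in plist.get("Children", []):
        if folder.get("Title") == "com.apple.ReadingList":
            return folder.get("Children", [])
    return []
```

Each returned entry should carry the same keys the script reads, such as `URLString` and `ReadingList['DateAdded']`, assuming Safari's format has not changed.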
Managed to get this to work. There seems to be an issue with the lines below --
lastRLBookmark = self.pb.posts.recent(tag='.readinglist', count=1)
last = lastRLBookmark['date']
Pinboard returns current datetime if '.readinglist' tag doesn't exist yet (true in my case). So I had to pick an old Pinboard bookmark and tag it with '.readinglist'.
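An alternative to tagging an old bookmark by hand might be to guard against the empty case in code. The sketch below assumes the response from `posts.recent` behaves like a dict with a 'posts' list alongside the 'date' value; that shape is an assumption I have not verified against the library, and `sync_cutoff` and `FALLBACK_CUTOFF` are hypothetical names.

```python
from datetime import datetime

# Hypothetical fallback, old enough that every Reading List item is newer
FALLBACK_CUTOFF = datetime(2013, 1, 1)


def sync_cutoff(recent, fallback=FALLBACK_CUTOFF):
    """Choose the 'last synced' date from a posts.recent()-style response.

    Assumes `recent` is dict-like with 'posts' and 'date' keys. When no
    bookmark carries the tag yet, the reported date is not meaningful,
    so the fallback is returned instead.
    """
    if not recent.get("posts"):
        return fallback
    return recent.get("date", fallback)
```

With that in place, a first run with no '.readinglist' bookmarks would treat every Reading List item as new.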
Also had to change the first line to --
#!/usr/bin/env python
Script also throws error: 'Syntax Error Expected end of line, etc. but found identifier.'
That's an AppleScript error, I believe, which doesn't make any sense. Running as a workflow or from the command line?
I run as workflow from automator (generic error message) and as script from script editor (where I got this error message). Error refers to line 'import plistlib'.
I'm going to assume it has to do with your Python version. plistlib is available in Python 2.6+. What do you get when you run:
python --version
Have you considered talking to iCloud directly instead of parsing the plist?
(Really Good safari+pinboard integration is something I've wanted for ages, because Reading List on iOS is so convenient, but I've never had the time+employer permission to finish a project. Glad to see you release this!)
Is iCloud parsing of Reading List possible from a script? If so, I'd be very interested in resources...
Yep! I started, but never had the chance to finish, and employer permission is...difficult to get. It's also not entirely clear if this is OK with Apple (but then again, directly reading the PLIST almost certainly isn't either.) I figured most of this out by using Charles and watching what Safari did when I added and removed reading list entries.
https://github.com/abl/iclo... (based off of another project which appears to have been deleted; I just added bookmark API support.)
You'll need to pip install httplib2 and pydes, at which point (and I just verified this) you can run:
python -i test_bookmark_list.py
Take a look at the 'b' object, which represents all of your Safari bookmarks. Reading List is implemented as a specially named folder (com.apple.ReadingList, probably at b[0].) Unfortunately it looks like Apple changed something, as trying to actually read an individual bookmark yields an assertion error.
https://github.com/picklepe... appears to be an attempt to make a solid Python library for iCloud - they don't support bookmarks, but adding support there might be easier than working with my hack of a hack. :)