ClippyKindle package¶
Submodules¶
ClippyKindle.DataStructures module¶
-
class
ClippyKindle.DataStructures.Book(title, author='')¶
Bases: object
Data structure for storing all highlights/notes/bookmarks for a given book.
-
__init__(title, author='')¶ Initialize a Book object.
- Parameters
title (str) – Title of the book.
author (str) – Optional; Author of the book.
-
cutAfter(cutDate)¶ Removes all data in this Book object that was modified on or after the provided timestamp.
- Parameters
cutDate (datetime.datetime) – cutoff date for preserving data in this Book
- Returns
None
-
cutBefore(cutDate)¶ Removes all data in this Book object that was modified on or before the provided timestamp.
- Parameters
cutDate (datetime.datetime) – cutoff date for preserving data in this Book
- Returns
None
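Both cutoff methods discard items dated exactly at the cutoff; they differ only in which side of the timestamp is removed. The sketch below illustrates that boundary behavior on a plain list of datetimes. `cut_after` and `cut_before` are hypothetical stand-ins written for this illustration, not the package's implementation, and unlike the documented methods (which modify the Book and return None) they return new lists.

```python
from datetime import datetime

def cut_after(dates, cut_date):
    """Keep only items dated strictly before cut_date (hypothetical helper)."""
    return [d for d in dates if d < cut_date]

def cut_before(dates, cut_date):
    """Keep only items dated strictly after cut_date (hypothetical helper)."""
    return [d for d in dates if d > cut_date]

dates = [datetime(2021, 1, 1), datetime(2021, 6, 1), datetime(2021, 12, 1)]
cutoff = datetime(2021, 6, 1)

kept_early = cut_after(dates, cutoff)   # the item dated exactly at the cutoff is dropped
kept_late = cut_before(dates, cutoff)   # likewise dropped here
```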
-
static
fromDict(d)¶
- Returns
A new Book object populated with the values from a provided dict (e.g. read from a JSON file)
-
getDateRange()¶ Retrieves the datetime of the earliest and latest item (note, highlight, or bookmark) stored in this book. The datetime of an item is the timestamp at which it was originally added to the book.
- Returns
(tuple of datetime.datetime objects) The first element is the earliest date and the second is the latest. If the book has no items, the earliest is returned as None and the latest as the datetime at epoch 0.
-
getName()¶ returns a string containing the book’s title and author (if known) e.g. “How to Live on 24 Hours a Day by Arnold Bennett.md”
-
sort(removeDups)¶ Sorts the arrays self.highlights, self.notes, and self.bookmarks. Each array is ordered by (increasing) location in the book, with ties broken by the date recorded. Optionally removes duplicates within each array.
- Parameters
removeDups (bool) – set True to remove suspected duplicates within self.notes, self.highlights, self.bookmarks. the oldest item in each set of duplicates is the one preserved (the one last modified)
- Returns
None
-
toCSV()¶ Converts this book object to a CSV file (columns sorted by location in book increasing).
- Returns
Array of lists representing each row (can be written to a CSV file later).
-
toDict()¶ Converts this book object to a dict (which can be jsonified later).
- Returns
A dict storing all the data in this book.
- Return type
(dict)
-
-
class
ClippyKindle.DataStructures.Bookmark(loc, locType, date)¶
Bases: object
Data structure for storing info about a single bookmark
-
__init__(loc, locType, date)¶ Bookmark class constructor
- Parameters
loc (int) – page or location value this bookmark was made at
locType (str) – “page” or “location” (identifies what location type this bookmark uses)
date (datetime.datetime) – date this bookmark was made
-
static
fromDict(d)¶
- Returns
A new Bookmark object populated with the values from a provided dict (created with toDict())
- Return type
(Bookmark)
-
isDuplicate(other)¶ returns true if provided Bookmark object can be considered a duplicate of this object
- Parameters
other (Bookmark) – other Bookmark object to compare this object to
- Returns
true or false.
- Return type
(bool)
-
toDict()¶
- Returns
Dict representing this object.
- Return type
(dict)
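The toDict()/fromDict() pair is what lets an item survive a trip through JSON. A minimal sketch of that round trip follows, assuming a stand-in class `BookmarkSketch` (hypothetical, not the package's Bookmark). The “loc” and “dateStr” keys appear elsewhere in this reference; the “locType” key and the ISO date encoding are assumptions made for the example.

```python
import json
from datetime import datetime

class BookmarkSketch:
    """Hypothetical stand-in for a Bookmark-like item."""

    def __init__(self, loc, locType, date):
        self.loc = loc
        self.locType = locType
        self.date = date

    def toDict(self):
        # Dates are not JSON-serializable, so store them as a string (assumed ISO format).
        return {"loc": self.loc, "locType": self.locType,
                "dateStr": self.date.isoformat()}

    @staticmethod
    def fromDict(d):
        return BookmarkSketch(d["loc"], d["locType"],
                              datetime.fromisoformat(d["dateStr"]))

original = BookmarkSketch(42, "location", datetime(2021, 3, 14, 15, 9, 26))
as_json = json.dumps(original.toDict())            # serialize to a JSON string
restored = BookmarkSketch.fromDict(json.loads(as_json))
```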
-
-
ClippyKindle.DataStructures.GCS(string1, string2)¶
- Returns
The greatest (longest) common substring between two provided strings (returns empty string if there is no overlap)
- Return type
(str)
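For readers unfamiliar with the longest-common-substring problem, here is one standard way to compute it with dynamic programming. This is a sketch of the general technique under the name `longest_common_substring` (hypothetical); it is not necessarily how GCS is implemented in the package.

```python
def longest_common_substring(s1, s2):
    """Return the longest contiguous run of characters shared by s1 and s2."""
    best_len = 0
    best_end = 0                      # exclusive end index of the best match in s1
    prev = [0] * (len(s2) + 1)        # match lengths for the previous row of the table
    for i in range(1, len(s1) + 1):
        curr = [0] * (len(s2) + 1)
        for j in range(1, len(s2) + 1):
            if s1[i - 1] == s2[j - 1]:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return s1[best_end - best_len:best_end]

overlap = longest_common_substring("the quick brown fox", "a quick brown dog")
no_overlap = longest_common_substring("abc", "xyz")
```

A shared run like this is the kind of signal a fuzzy duplicate check can use when two highlights cover overlapping text.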
-
class
ClippyKindle.DataStructures.Highlight(loc, locType, date, content)¶
Bases: object
Data structure for storing info about a single highlight
-
__init__(loc, locType, date, content)¶ Highlight class constructor
- Parameters
loc (tuple) – (int locStart, int locEnd)
locType (str) – “page” or “location” (identifies what location type this highlight uses)
date (datetime.datetime) – date this highlight was made
content (str) – book text stored in this highlight
-
static
fromDict(d)¶
- Returns
A new Highlight object populated with the values from a provided dict (created with toDict()).
-
isDuplicate(other, fuzzyMatch=True)¶ returns true if provided Highlight object can be considered a duplicate of this object
- Parameters
other (Highlight) – other Highlight object to compare this object to
fuzzyMatch (bool) – true if we should consider Highlights with overlapping content (but not exactly the same) to be duplicates (default: True)
- Returns
true or false.
- Return type
(bool)
-
toDict()¶
- Returns
A dict representing this object.
- Return type
(dict)
-
-
class
ClippyKindle.DataStructures.Note(loc, locType, date, content)¶
Bases: object
Data structure for storing info about a single note
-
__init__(loc, locType, date, content)¶ Note class constructor
- Parameters
loc (int) – page or location value this note was made at
locType (str) – “page” or “location” (identifies what location type this note uses)
date (datetime.datetime) – date this note was made
content (str) – text contents of the note
-
static
fromDict(d)¶
- Returns
A new Note object populated with the values from a provided dict (created with toDict())
-
isDuplicate(other, fuzzyMatch=True)¶ returns true if provided Note object can be considered a duplicate of this object
- Parameters
other (Note) – other Note object to compare this object to
fuzzyMatch (bool) – true if we should consider Notes with overlapping content (but not exactly the same) to be duplicates (default: True)
- Returns
true or false
- Return type
(bool)
-
toDict()¶
- Returns
A dict representing this object
- Return type
(dict)
-
-
ClippyKindle.DataStructures.sortDictList(arr)¶ Helper function for sorting a list of dicts (representing Highlight/Note/Bookmark objects) in order by (increasing) page/location within the book (ties broken by date recorded).
- Parameters
arr (list of dict objects) – list of dicts that contain (at least) the fields “loc” and “dateStr” (these dicts should have been created by a call of toDict())
- Returns
original list of dicts except now reordered
- Return type
(list of dict objects)
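The ordering described above (location first, date as the tie-breaker) maps naturally onto a tuple sort key. The sketch below uses a hypothetical `sort_dict_list` and assumes that “loc” values are mutually comparable and that “dateStr” uses a format whose string order matches chronological order; neither assumption is confirmed by this reference.

```python
def sort_dict_list(arr):
    """Return the dicts ordered by location, breaking ties by date string."""
    # Tuples compare element by element, so "dateStr" only matters when "loc" is equal.
    return sorted(arr, key=lambda d: (d["loc"], d["dateStr"]))

items = [
    {"loc": 120, "dateStr": "2021-05-02 10:00:00"},
    {"loc": 15, "dateStr": "2021-05-03 09:30:00"},
    {"loc": 120, "dateStr": "2021-05-01 08:00:00"},
]
ordered = sort_dict_list(items)
```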
Module contents¶
-
class
ClippyKindle.ClippyKindle¶
Bases: object
Does the work of parsing either a “My Clippings.txt” file or a previously exported JSON file created from a “My Clippings.txt” file, using classes from ClippyKindle.DataStructures for storage.
-
static
parseClippings(fname, verbose=False)¶ parses the notes/highlights/bookmarks stored in a kindle clippings txt file (printing any errors) and returns the data as an array of dicts (each dict representing the data from one book).
- Parameters
fname (str) – file path to txt file to parse (e.g. “My Clippings.txt”)
verbose (bool) – Optional; defaults to False. TODO: use verbose param with options 0 (print nothing), 1 (print everything), and 2 (print errors only)
- Returns
list of Book objects
- Return type
(list of DataStructures.Book)
-
static
parseJsonFile(fname)¶ Parses the notes/highlights/bookmarks stored in a JSON file previously created with ClippyKindle and returns an array of Book objects.
- Parameters
fname (str) – file path to json file to parse (e.g. “collection.json”)
- Returns
list of Book objects
- Return type
(list of DataStructures.Book)
-
static
-
ClippyKindle.dateToStr(dateObj)¶ Converts a provided datetime object to a string with the desired formatting.
-
ClippyKindle.strToDate(dateStr)¶ Converts a provided string (of the desired formatting) to a datetime object.
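The two date helpers are inverses of each other around a single format string. Since this reference does not state which format the package uses, the sketch below picks `"%Y-%m-%d %H:%M:%S"` purely as an assumption, and `date_to_str` / `str_to_date` are hypothetical names for the illustration.

```python
from datetime import datetime

# Assumed format for this example only; the package's actual format is not documented here.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

def date_to_str(date_obj):
    """Format a datetime as a string using DATE_FORMAT."""
    return date_obj.strftime(DATE_FORMAT)

def str_to_date(date_str):
    """Parse a DATE_FORMAT string back into a datetime."""
    return datetime.strptime(date_str, DATE_FORMAT)

stamp = datetime(2021, 3, 14, 15, 9, 26)
text = date_to_str(stamp)
round_tripped = str_to_date(text)
```

With a second-resolution format like this one, any microseconds on the original datetime would be lost in the round trip.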