Jan 05

DaleyKlippings v1.02

    This version includes several improvements and bug fixes:

    • Fixed a bug preventing the system from importing a Note that includes multiple lines.
    • Fixed an error in CommaSafe
    • Upgraded import patterns to better handle titles with parentheses (followed by authors with parentheses).  Titles with parentheses but no author will continue to be processed incorrectly, as DaleyKlippings will assume the parentheses surround the author.
    • Improved Author Match
    • Updated several instances where the program still showed the old (Klippings) name.
    Jan 15

    DaleyKlippings v1.0

      With over 100 downloads of v0.7 and no new issues reported, I am removing the “beta” tag from the program.  I’d also like to extend a special thanks to everyone who has donated!

      • There are no functional changes in v1.0
      • Kindle Paperwhite Import Patterns are included

      If you want the Paperwhite patterns and don’t want to bother with the new software (since it currently overwrites preferences), the pattern can be viewed at the Import Pattern for Kindle Paperwhite post.

      Jan 03

      Import Pattern for Kindle Paperwhite 5.3.1

      This pattern addresses another date-time issue.  A file from the Paperwhite was provided that included date-times in the following format:

      Added on Monday, 23 April 12 22:51:41

      While this is roughly the same date-time pattern found in the previous GMT example, it was necessary to make more significant changes to the expression around the Date tag.  In previous patterns we could depend on AM/PM or GMT to clearly mark the end of the date-time.  This format has no obvious terminator.  Instead, the Date tag captures everything up to the end-of-line characters “\r\n”.  The regex character class [^…] matches any single character except those listed in place of the ellipsis, so [^\r\n]* keeps consuming characters until it reaches the line break.
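As a rough illustration of the negated character class, the Python sketch below runs only the Date fragment of the pattern against a sample line. The variable names and the sample clipping text are made up for this example and are not part of DaleyKlippings.

```python
import re

# Only the Date fragment of the Notes pattern, not the full expression.
# With VERBOSE on, a literal space has to be written as "\ ".
date_fragment = re.compile(r"Added\ on\ (?P<Date>[^\r\n]*)", re.VERBOSE)

# Hypothetical clipping body using Windows line endings.
sample = (
    "- Your Highlight Location 245-246 | "
    "Added on Monday, 23 April 12 22:51:41\r\n\r\nSome highlighted text\r\n"
)

match = date_fragment.search(sample)
captured_date = match.group("Date")  # stops at the first \r or \n
print(captured_date)
```

The capture ends at the line break even though nothing in the date-time itself marks its end.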

      Notes delimiter:

      ==========

      Notes pattern:

      # Import notes, highlights and bookmarks from "My Clippings.txt"
      
      # Note that VERBOSE and UNICODE options are always on
      
      ^\s* #
      (?P<Book>.*?) # Book name
      (\s*\((?P<Author>[^\(]*)\))? # Author name (optional)
      \s*-\ Your\ #
      (?P<Type>(Highlight|Note|Bookmark)) # Clipping type - 'Highlight', 'Note' or 'Bookmark'
      (\ on\ Page\ #
      (?P<Page>[\d-]*)\ \|)? # Page (optional)
      (.*(Location|Loc\.)\ #
      (?P<Location>[\d-]*))? # Location (optional)
      .*?Added\ on\ #
      (?P<Date>([^\r\n]*)) # Date & time
      \s* #
      (?P<Text>.*?) # Text
      \s*$ #

      Date Format:

      %A, %d %B %y %H:%M:%S
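To check a date format like this one outside the program, a minimal sketch using Python's standard datetime module is shown below. It assumes an English locale, since %A and %B match locale-dependent day and month names.

```python
from datetime import datetime

# The Date Format above, applied to the sample Paperwhite timestamp.
date_format = "%A, %d %B %y %H:%M:%S"
parsed = datetime.strptime("Monday, 23 April 12 22:51:41", date_format)

# %y reads the two-digit year "12" as 2012.
print(parsed.isoformat())
```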

      Encoding:

      UTF-8 (all languages)
      Dec 21

      Import Pattern for Kindle 3

      The original Kindle 3 pattern had a bug so this improved pattern was included in v0.7.  Note that this version excludes bookmarks.

      Notes delimiter:

      ==========

      Notes pattern:

      # Import notes and highlights from "My Clippings.txt" and ignore
      # bookmarks. Warnings with information on ignored bookmarks
      # will be added to the log - this is the app normal behaviour
      # Note that VERBOSE and UNICODE options are always on
      ^\s*                         #
      (?P<Book>.*?)                # Book name
      (\s*\((?P<Author>[^\(]*)\))? # Author name (optional)
      \s*-\                        #
      (?P<Type>(Highlight|Note))   # Clipping type - 'Highlight' or 'Note'
      (\ on\ Page\                 #
      (?P<Page>[\d-]*)\ \|)?       # Page (optional)
      (.*(Location|Loc\.)\         #
      (?P<Location>[\d-]*))?       # Location (optional)
      .*?Added\ on\                #
      (?P<Date>(.*)(AM|PM))        # Date & time
      \s*                          #
      (?P<Text>.*?)                # Text
      \s*$                         #

      Date Format – This field is left empty because the default matching pattern works with everything we tested.

      Encoding – While we pick UTF-8, most files aren’t encoded this way.  However, as of v0.7, the system will automatically try UTF-16 and Windows-1252 if the configured encoding fails.  By selecting UTF-8, we catch anything that happens to be encoded in UTF-8 before falling back on UTF-16 and Windows-1252.

      UTF-8 (all languages)
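The fallback order described above can be sketched in Python roughly as follows. This is an illustration only: the function name and signature are hypothetical, and the actual DaleyKlippings code may differ.

```python
def decode_with_fallback(raw, preferred="utf-8",
                         fallbacks=("utf-16", "windows-1252")):
    """Return (text, encoding) for the first encoding that decodes cleanly."""
    for encoding in (preferred, *fallbacks):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError("none of the candidate encodings could decode the data")


# A UTF-16 file (with BOM) is rejected by UTF-8 and picked up by the fallback.
text, used = decode_with_fallback("Highlight".encode("utf-16"))
print(used)
```

One caveat of this approach is that UTF-16 can accept some byte sequences that were really written in another encoding, so the order of the candidates matters.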
      Dec 21

      DaleyKlippings v0.7

        This version includes several improvements and bug fixes:

        • Kindle 3 pattern upgraded to use all new fields
          • Bug in old Kindle 3 pattern eliminated as a result
        • Improved feedback for Note-Highlight matches
        • Improved logic for file imports which should reduce/eliminate BOM errors
        • Improved error reporting when the default file encoding (for imports) is not used
        Dec 19

        Import Pattern for Kindle 4 showing GMT, with bookmarks

        Some notes about this pattern before we start. If you just want to use the pattern (included in the installer from v0.6 on), skip down to the notes delimiter header.

        A user had a My Clippings file that included date/times in the following format:

        Added on Monday, 23 April 12 22:51:41 GMT+01:00

        This was a significant difference from the US Kindle Touch and, unfortunately, was unsupported by the library used to parse dates. My solution was to exclude the “GMT+01:00” part of the match on the assumption that DaleyKlippings would correct for it by using the user’s local time. Even if this isn’t true, the difference should be trivial in the greater scheme of things.

        A few other differences are present (especially if you’re using these samples to learn):

        • This pattern excludes the phrase “Your ” (including the trailing space) in front of the word Note/Highlight/Bookmark
        • This pattern includes bookmarks by adding the phrase inside the Type tag, e.g. (Highlight|Note|Bookmark)
        • Location was presented as Loc. so this was added to the matching text as a valid option
        • Note that “GMT” is outside the (?P<Date>…) tag.  This ensures that it is properly matched by the regular expression, but excludes it from the tag.
        • Because we had a two-digit year and there is a bug in our standard date/time object (QDateTime, which has its own format codes), this pattern uses a date format compatible with Python’s datetime object (see Python’s datetime documentation for the full list of codes).  The code should automatically recognize and use the right parser based on your date/time pattern.
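To see how placing “GMT” outside the Date tag behaves, the sketch below runs only the date-related lines of the pattern in Python. The sample line and variable names are made up for illustration.

```python
import re

# Date-related lines only, copied from the Notes pattern for this post.
gmt_fragment = re.compile(
    r"""
    Added\ on\                 #
    (?P<Date>(.*))             # Date & time
    (\ GMT(\+|\-)[0-9:]*)      # Padding to exclude GMT and +/- number
    """,
    re.VERBOSE,
)

line = "- Highlight Loc. 1024-25 | Added on Monday, 23 April 12 22:51:41 GMT+01:00"
match = gmt_fragment.search(line)

date_text = match.group("Date")  # the tagged date, without the GMT offset
whole_match = match.group(0)     # the full match still consumes the offset
print(date_text)
```

The offset is consumed by the expression, so it cannot leak into the Text tag, but it never reaches the date parser.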

        Notes delimiter:

        ==========

        Notes pattern:

        # Import notes, highlights and bookmarks from "My Clippings.txt"
        
        # Note that VERBOSE and UNICODE options are always on
        
        ^\s*                           #
        (?P<Book>.*?)                  # Book name
        (\s*\((?P<Author>[^\(]*)\))?   # Author name (optional)
        \s*-\                          #
        (?P<Type>(Highlight|Note|Bookmark))     # Clipping types accepted
        (\ on\ Page\                   #
        (?P<Page>[\d-]*)\ \|)?         # Page (optional)
        (.*(Location|Loc\.)\           #
        (?P<Location>[\d-]*))?         # Location (optional)
        .*?Added\ on\                  #
        (?P<Date>(.*))                 # Date & time
        (\ GMT(\+|\-)[0-9:]*)          # Padding to exclude GMT and +/- number
        \s*                            #
        (?P<Text>.*?)                  # Text
        \s*$                           #

        Date Format:

        %A, %d %B %y %H:%M:%S

        Encoding:

        UTF-8 (all languages)
        Dec 19

        DaleyKlippings v0.6

          This version includes several improvements and bug fixes:

          • Added matching logic so location ranges like 245-6 will be processed correctly.
          • Upgraded built-in CSV templates from QuoteSafe to QuoteEscape
          • Added a Kindle 4 importer that supports timestamps ending in GMT+… or GMT-…
          • Added bookmark friendly importers
          • Altered the time matching algorithm.  It now supports two modes: Python datetime (if the format string includes the % character) or QDateTime (if it does not).  This works around a bug in QDateTime, which was not matching two-digit years.
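The two-mode date matching could be sketched in Python along these lines. The function and its qt_parser argument are hypothetical stand-ins. The real program calls QDateTime directly, which is left out here so the sketch runs without Qt installed.

```python
from datetime import datetime


def parse_clipping_date(text, date_format, qt_parser=None):
    """Use Python's datetime for '%' formats; otherwise defer to a Qt-style parser."""
    if "%" in date_format:
        return datetime.strptime(text, date_format)
    if qt_parser is None:
        raise ValueError("no Qt-style parser supplied for format: " + date_format)
    # Hypothetical hook standing in for the QDateTime-based code path.
    return qt_parser(text, date_format)


# A '%' in the format selects the Python datetime mode.
parsed_gmt = parse_clipping_date("Monday, 23 April 12 22:51:41",
                                 "%A, %d %B %y %H:%M:%S")
print(parsed_gmt)
```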
          Dec 17

          DaleyKlippings v0.4

            This version includes several additional features and bug fixes:

            • Fixed page and location matching logic to reduce the odds of an error when matching in files that include personal documents
            • Changed the SpanXmlSafe prefix to XmlSafeSpan
              • This ensures consistency with best practices under Export Patterns
              • Updated in code and built-in Evernote pattern
            Dec 17

            Import Pattern for Kindle(R) Touch v5.1.2

            This import pattern has been tested on a Kindle Touch running software version 5.1.2.  It may work on different hardware or older/newer software, but you are encouraged to find (or request) a pattern designed especially for your Kindle.

            The pattern is designed to import Notes and Highlights, but not bookmarks.  It attempts to parse the book name into “Book (Author)”, the standard Kindle format.  This usually does not work on Kindle documents, in which case the complete book name (including the author, if listed) is stored in the Book field.
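The “Book (Author)” split can be tried in isolation with the Python sketch below. It is a reduced fragment: only the Book and Author tags are kept, and a trailing \s*$ is added so that a title line can be tested on its own. The helper name and the sample titles are made up.

```python
import re

# Reduced fragment: only the Book and Author tags from the Notes pattern,
# anchored with \s*$ so it can be tried on a title line by itself.
title_fragment = re.compile(
    r"""
    ^\s*                           #
    (?P<Book>.*?)                  # Book name
    (\s*\((?P<Author>[^\(]*)\))?   # Author name (optional)
    \s*$                           #
    """,
    re.VERBOSE | re.UNICODE,
)


def split_title(line):
    """Return (book, author); author is None when no parentheses are found."""
    match = title_fragment.match(line)
    return match.group("Book"), match.group("Author")


standard = split_title("Dracula (Bram Stoker)")
no_parens = split_title("meeting-notes")
parens_no_author = split_title("Project Plan (Draft 2)")
print(standard, no_parens, parens_no_author)
```

The last case shows the limitation: a trailing parenthesised phrase is treated as the author even when it is really part of the title.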

            Notes delimiter:

            ==========

            Notes pattern:

            # Import notes and highlights from "My Clippings.txt" and ignore
            # bookmarks. Warnings with information on ignored bookmarks
            # will be added to the log - this is the app normal behaviour
            
            # Note that VERBOSE and UNICODE options are always on
            
            ^\s*                           #
            (?P<Book>.*?)                  # Book name
            (\s*\((?P<Author>[^\(]*)\))?   # Author name (optional)
            \s*-\ Your\                    #
            (?P<Type>(Highlight|Note))     # Clipping type - 'Highlight' or 'Note'
            (\ on\ Page\                   #
            (?P<Page>[\d-]*)\ \|)?         # Page (optional)
            (.*(Location)\                 #
            (?P<Location>[\d-]*))?         # Location (optional)
            .*?Added\ on\                  #
            (?P<Date>(.*)(AM|PM))          # Date & time
            \s*                            #
            (?P<Text>.*?)                  # Text
            \s*$                           #

            Date Format:

            dddd, MMMM d, yyyy h:mm:ss A

            Encoding:

            UTF-8 (all languages)