use the precompiled flex-based parser from Adium's Auto Hyperlinks framework instead of NFA regular expressions compiled at runtime
scrod committed Dec 29, 2010
1 parent 5b69179 commit d060842
Showing 13 changed files with 384 additions and 165 deletions.
99 changes: 72 additions & 27 deletions Acknowledgments.txt
@@ -338,31 +338,76 @@ http://www.opensource.org/licenses/bsd-license.php

--

CocoaICU

Copyright (c) 2005-2006, Aaron Evans
* All rights reserved.
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of Aaron Evans, nor the names of its contributors may
* be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS ``AS IS''
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
* DISCLAIMED. IN NO EVENT SHALL THE COYPRIGHT HOLDERS AND CONTRIBUTORS AND
* CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
* EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
* PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
* OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
* WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
* OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
* ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
AutoHyperlinks framework

/*
* The AutoHyperlinks Framework is the legal property of its developers (DEVELOPERS), whose names are listed in the
* copyright file included with this source distribution.
*
* Copyright (c) 2004-2008
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of the AutoHyperlinks Framework nor the
* names of its contributors may be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY ITS DEVELOPERS ``AS IS'' AND ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
* WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
* DISCLAIMED. IN NO EVENT SHALL ITS DEVELOPERS BE LIABLE FOR ANY
* DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
* ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

Adium (AutoHyperlinks framework)
Copyright (C) 2001-2005 by the following:

Adam Atlas
Colin Barrett
Erik J. Barzeski
Max Bertrand
Adam Betts
Jorge Salvador Caffarena
David Clark
Nelson Elhage
Ken Ferry
Christopher Forsythe
Brian Ganninger
Arno Hautala
Asher Haig
Jasper Hauser
Stephen Holt
Adam Iser
Severin Klaus
Ian Krieg
Thomas Kunze
Scott Lamb
Jack M.H. Lin
Sam McCandlish
Nicola Del Monaco
David Munch
Daisuke Okada
Mac-arena the Bored Zo
Chris Serino
Jeffrey Melloy
Roeland Nas
Laura Natcher
Daisuke Okada
Stephen Poprocki
Evan Schoenberg
David Smith
Greg Smith
Vinay Venkatesh
Wesley Underwood

75 changes: 24 additions & 51 deletions AttributedPlainText.m
@@ -19,9 +19,9 @@
#import "AttributedPlainText.h"
#import "NSCollection_utils.h"
#import "GlobalPrefs.h"
#import "ICUPattern.h"
#import "ICUMatcher.h"
#import "NSString_NV.h"
#import <AutoHyperlinks/AutoHyperlinks.h>


@implementation NSMutableAttributedString (AttributedPlainText)

@@ -168,61 +168,34 @@ - (BOOL)restyleTextToFont:(NSFont*)currentFont usingBaseFont:(NSFont*)baseFont {

- (void)addLinkAttributesForRange:(NSRange)changedRange {

if (!changedRange.length) return;
if (!changedRange.length)
return;

static ICUPattern *urlPattern = nil;
//This regexp modeled on John Gruber's patterns: http://daringfireball.net/2010/07/improved_regex_for_matching_urls
if (!urlPattern) urlPattern = [ICUPattern patternWithString:
@"(?i)\\b((?:[a-z][\\w-]+:/{2,3}|www\\d{0,3}[.]|[a-z0-9.\\-]+[.][a-z]{2,4}/)(?:[^\\s()<>\\[\\]]+|\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\))+(?:\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\)|[^\\s`!()\\[\\]{};:'\".,<>?«»“”‘’]))"];
//For a heavier-duty implementation, Adium's AutoHyperlinks framework (based on flex) might be better
//lazily loads Adium's BSD-licensed Auto-Hyperlinks:
//http://trac.adium.im/wiki/AutoHyperlinksFramework

static ICUPattern *emailPattern = nil;
//use a separate regexp for email addresses, eschewing any that contain an inner colon
if (!emailPattern) emailPattern = [ICUPattern patternWithString:@"(\\w+([-+.']\\w+)*?@(?>\\w+([-.]\\w+)*?\\.\\w+([-.]\\w+)*))(?=[^:]|:*?($|\\s))"];

[self beginEditing];
@try {
NSMutableIndexSet *urlIndexes = [NSMutableIndexSet indexSet];

ICUMatcher *matcher = [ICUMatcher matcherWithPattern:urlPattern overString:[self string] range:changedRange];

while ([matcher findNext]) {
NSRange range = [matcher rangeOfMatch];
NSString *extractedMatch = [[self string] substringWithRange:range];
[urlIndexes addIndexesInRange:range];

NSURL *url = [NSURL URLWithString:extractedMatch];
if (![[url scheme] length]) {
//if the parsed URL lacks an explicit protocol specifier, just assume it's http
url = [NSURL URLWithString:[@"http://" stringByAppendingString:extractedMatch]];
}
//File Reference URLs cannot be safely archived!
if (url && !([url isFileURL] && [extractedMatch rangeOfString:@"/.file/" options:NSLiteralSearch].location != NSNotFound))
[self addAttribute:NSLinkAttributeName value:url range:range];
static Class AHHyperlinkScanner = Nil;
static Class AHMarkedHyperlink = Nil;
if (!AHHyperlinkScanner || !AHMarkedHyperlink) {
if (![[NSBundle bundleWithPath:[[[NSBundle mainBundle] privateFrameworksPath] stringByAppendingPathComponent:@"AutoHyperlinks.framework"]] load]) {
NSLog(@"Could not load AutoHyperlinks framework");
return;
}

matcher = [ICUMatcher matcherWithPattern:emailPattern overString:[self string] range:changedRange];
while ([matcher findNext]) {
NSRange range = [matcher rangeOfMatch];

//don't make links if part of the range was already matched as a URL
if (![urlIndexes intersectsIndexesInRange:range]) {
NSURL *url = [NSURL URLWithString:[@"mailto:" stringByAppendingString:[[self string] substringWithRange:range]]];
if (url) [self addAttribute:NSLinkAttributeName value:url range:range];
}
}

//NEXT: add [[ ]] url-links?


AHHyperlinkScanner = NSClassFromString(@"AHHyperlinkScanner");
AHMarkedHyperlink = NSClassFromString(@"AHMarkedHyperlink");
}
@catch (NSException *e) {
NSLog(@"Failed adding link attributes for %u-char string: %@", [self length], e);
}
@finally {
[self endEditing];

id scanner = [AHHyperlinkScanner hyperlinkScannerWithString:[[self string] substringWithRange:changedRange]];
id markedLink = nil;
while ((markedLink = [scanner nextURI])) {
NSURL *markedLinkURL = nil;
if ((markedLinkURL = [markedLink URL])) {
[self addAttribute:NSLinkAttributeName value:markedLinkURL
range:NSMakeRange([markedLink range].location + changedRange.location, [markedLink range].length)];
}
}

//also detect double-bracketed URLs here
}
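
Note on the range arithmetic in the new loop: the scanner is created from the substring covered by changedRange, so each [markedLink range] is relative to that substring and has to be shifted by changedRange.location before the link attribute is applied to the full string. A minimal sketch of that translation in plain C (which also compiles as Objective-C) is given here. The Range struct and range_in_full_string are hypothetical stand-ins for NSRange and the inline NSMakeRange call, not part of the commit, and the bounds check is an added assumption about the scanner's output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for NSRange so the sketch compiles without Foundation. */
typedef struct {
    size_t location;
    size_t length;
} Range;

/*
 * Convert a match found inside the scanned substring into a range that
 * is valid in the full string.
 *   match   - range reported by the scanner, relative to the substring
 *   changed - range of the full string that was handed to the scanner
 */
static Range range_in_full_string(Range match, Range changed)
{
    /* Assumption: a match never extends past the substring it was found in. */
    assert(match.location + match.length <= changed.length);

    /* Only the location moves; the length of the match is unchanged. */
    Range result = { changed.location + match.location, match.length };
    return result;
}
```

For example, a match at offset 4 with length 18 inside a changed range that starts at 100 maps to location 104 with length 18 in the full string.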


1 change: 1 addition & 0 deletions AutoHyperlinks.framework/AutoHyperlinks
1 change: 1 addition & 0 deletions AutoHyperlinks.framework/Headers
1 change: 1 addition & 0 deletions AutoHyperlinks.framework/Resources
Binary file not shown.
152 changes: 152 additions & 0 deletions AutoHyperlinks.framework/Versions/A/Headers/AHHyperlinkScanner.h
@@ -0,0 +1,152 @@
/*
* The AutoHyperlinks Framework is the legal property of its developers (DEVELOPERS), whose names are listed in the
* copyright file included with this source distribution.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of the AutoHyperlinks Framework nor the
* names of its contributors may be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY ITS DEVELOPERS ``AS IS'' AND ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
* WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
* DISCLAIMED. IN NO EVENT SHALL ITS DEVELOPERS BE LIABLE FOR ANY
* DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
* ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

#import "AHLinkLexer.h"

typedef void* yyscan_t;

extern long AHlex( yyscan_t yyscanner );
extern long AHlex_init( yyscan_t * ptr_yy_globals );
extern long AHlex_destroy ( yyscan_t yyscanner );
extern long AHget_leng ( yyscan_t scanner );
extern void AHset_in ( FILE * in_str , yyscan_t scanner );

typedef struct AH_buffer_state *AH_BUFFER_STATE;
extern void AH_switch_to_buffer(AH_BUFFER_STATE, yyscan_t scanner);
extern AH_BUFFER_STATE AH_scan_string (const char *, yyscan_t scanner);
extern void AH_delete_buffer(AH_BUFFER_STATE, yyscan_t scanner);

@class AHMarkedHyperlink;

@interface AHHyperlinkScanner : NSObject
{
NSDictionary *m_urlSchemes;
NSString *m_scanString;
NSAttributedString *m_scanAttrString;
BOOL m_strictChecking;
NSUInteger m_scanLocation;
NSUInteger m_scanStringLength;
}


/*!
* @brief Allocs and inits a new lax AHHyperlinkScanner with the given NSString
*
* @param inString the scanner's string
* @return a new AHHyperlinkScanner
*/
+ (id)hyperlinkScannerWithString:(NSString *)inString;

/*!
* @brief Allocs and inits a new strict AHHyperlinkScanner with the given NSString
*
* @param inString the scanner's string
* @return a new AHHyperlinkScanner
*/
+ (id)strictHyperlinkScannerWithString:(NSString *)inString;

/*!
* @brief Allocs and inits a new lax AHHyperlinkScanner with the given attributed string
*
* @param inString the scanner's string
* @return a new AHHyperlinkScanner
*/
+ (id)hyperlinkScannerWithAttributedString:(NSAttributedString *)inString;

/*!
* @brief Allocs and inits a new strict AHHyperlinkScanner with the given attributed string
*
* @param inString the scanner's string
* @return a new AHHyperlinkScanner
*/
+ (id)strictHyperlinkScannerWithAttributedString:(NSAttributedString *)inString;

/*!
* @brief Determine the validity of a given string with a custom strictness
*
* @param inString The string to be verified
* @param useStrictChecking Use strict rules or not
* @param index a pointer to the index the string starts at, for easy incrementing.
* @return Boolean
*/
+ (BOOL)isStringValidURI:(NSString *)inString usingStrict:(BOOL)useStrictChecking fromIndex:(NSUInteger *)index withStatus:(AH_URI_VERIFICATION_STATUS *)validStatus;

/*!
* @brief Init
*
* Inits a new AHHyperlinkScanner object for a NSString with the set strict checking option.
*
* @param inString the NSString to be scanned.
* @param flag Sets strict checking preference.
* @return A new AHHyperlinkScanner.
*/
- (id)initWithString:(NSString *)inString usingStrictChecking:(BOOL)flag;

/*!
* @brief Init
*
* Inits a new AHHyperlinkScanner object for a NSAttributedString with the set strict checking option.
*
* @param inString the NSAttributedString to be scanned.
* @param flag Sets strict checking preference.
* @return A new AHHyperlinkScanner.
*/
- (id)initWithAttributedString:(NSAttributedString *)inString usingStrictChecking:(BOOL)flag;


/*!
* @brief Determine the validity of the scanner's string using the set strictness
*
* @return Boolean
*/
- (BOOL)isValidURI;

/*!
* @brief Returns an AHMarkedHyperlink representing the next URI in the scanner's string
*
* @return A new AHMarkedHyperlink.
*/
- (AHMarkedHyperlink *)nextURI;

/*!
* @brief Fetches all the URIs from the scanner's string
*
* @return An array of AHMarkedHyperlinks representing each matched URL in the string or nil if no matches.
*/
- (NSArray *)allURIs;

/*!
* @brief Scans an attributed string for URIs then adds the link attribs and objects.
* @param inString The NSAttributedString to be linkified
* @return An autoreleased NSAttributedString.
*/
- (NSAttributedString *)linkifiedString;

- (NSUInteger)scanLocation;
- (void)setScanLocation:(NSUInteger)location;

@end