- I am the founder of Snowtide, which sells PDFTextStream, a PDF text extraction library for Java and .NET, and the creator of the Clojure Atlas. I do a lot of programming in Clojure and just a little in Java.
– Chas Emerick
Category Archives: PDFTextStream
Launching DocuHarvest – Turning documents into data
I’m happy today to introduce: Getting valuable data out of documents should not require an I.T. staff, outside consultants, building or buying software, or an up-front investment of hundreds or thousands of dollars, regardless of how many documents and how … Continue reading
Posted in Announcements, DocuHarvest, PDFTextStream
2 Comments
Reducing purchase anxiety is a feature
Talk to anyone outside of the software world, and you’ll quickly realize that one of the most gut-wrenching, anxiety-inducing acts is buying software. Even if one has evaluated the product in question top to bottom, past experience of bugs, botched … Continue reading
Posted in Announcements, Entrepreneurship, Open Source, PDFTextStream
Activity is not Progress (or, ‘Did you really need to shave that yak’)
Anyone who is accountable for any sufficiently complex objective constantly has their focus pulled away from that larger goal by a thousand different fiddly tasks. Christened as yak shaving some time ago by a fellow at the MIT media … Continue reading
Posted in Craftsmanship, PDFTextStream
1 Comment
Surprising Praise
I happen to work in a particular corner of the software industry that isn’t exactly the most happenin’ party zone. Compared to whatever is “hot” at any point in time, extracting data from documents seems dull to most. I’m not … Continue reading
Posted in PDFTextStream
New Year’s PDFTextStream Sale!
This morning, we put some limited-time-only discounts into place for PDFTextStream to celebrate the new year. You can now purchase PDFTextStream server deployment licenses for as little as $999 USD (optionally with Premium Support). These licenses carry no CPU restriction, … Continue reading
Posted in PDFTextStream
Free PDFTextStream for Academic Use
The title says it all. Today we’re announcing that PDFTextStream is free for academic use: read the press release, and if you are a qualifying academic developer, go ahead and apply for a free PDFTextStream license file. Don’t worry, the … Continue reading
Posted in PDFTextStream
Memory-mapping Files in Java Causes Problems
Today, we released PDFTextStream v2.0.1, a minor patch release that contains a workaround for an interesting and unfortunate bug: on Windows, if one accesses a PDF file on disk using PDFTextStream, then closes the PDFTextStream instance (using PDFTextStream.close()), the PDF … Continue reading
Posted in Java, PDFTextStream
2 Comments
Working Together: Python and Java, Open Source and Commercial
PDFTextStream started out as a Java library, but is now available and supported for Python. How that leap was made exemplifies how commercial and open source software efforts complement each other in the best of circumstances, and is also a fantastic case study in Java + Python integration. Continue reading
Posted in Java, Open Source, PDFTextStream, Python
1 Comment
Software Development and…Pregnancy?
For nearly a year, we have been working on a number of things in parallel:
- PDFTextStream v2.0
- The new snowtide.com website
- PDFTextOnline, our new AJAX-y PDF text extraction application/service/experiment
All three of these things are absurdly complex, and large, and … Continue reading
Posted in Announcements, PDFTextStream
Automated Quality Control, Part II
In my last post about quality control, I detailed the challenges we face in testing PDFTextStream in order to minimize hard faults, and some of the patchwork testing ‘strategy’ that we employed in the early days. Now, I’d like to … Continue reading
Posted in Craftsmanship, geek, PDFTextStream, Random Software Geekery
1 Comment



