Short hints on CSS pseudo-selectors and some reference sites. See how easy it is to set different styles based on element relationships using CSS.
PS: I guess you know these already so this post is not for CSS gurus!
Not much to write here except to point you to the original post about text zone extraction, a post that was quite popular back in the day.
It's time to back it up with code … 5ubliminal style :)

There is no such thing as an inexhaustible commons! But if you drink alone it lasts longer. Share it and you won't know what hit you when it ends.
Normalizing special characters from languages like Spanish and French to their ASCII equivalents is not really an easy task in PHP. My goal is to turn every character that resembles a t (Ţ) into a t, every one that resembles a c (Ç) into a c, every one that resembles an o (Ŏ) into an o, and so on … you get the idea.
And the provided code does just that!
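To give a feel for the idea, here is a minimal sketch in Python rather than PHP. It is not the code from the post: it assumes Unicode decomposition (NFKD) is good enough, and the function name `strip_accents` is made up for this example. Letters with no decomposition (ø or ß, for instance) pass through unchanged, so a real solution would still need a hand-made mapping table for those.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Replace accented letters with their unaccented base letters.

    Assumption: NFKD decomposition splits each accented letter into a
    base letter plus combining marks, which are then dropped.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    for sample in ("Ţ", "Ç", "Ŏ", "España", "français"):
        print(sample, "->", strip_accents(sample))
```

Case is preserved here, so Ţ becomes T; lowercase the result afterwards if you want everything folded to a plain t.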
A short while back I published a tutorial on how to easily find and extract text blocks within HTML. The time has come to evolve. We need to dig deeper and get the sentences out of the text chunks.
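As a rough illustration of getting sentences out of a text chunk, here is a naive sketch in Python. It is not the method from the tutorial: it simply assumes a sentence ends at `.`, `!` or `?` followed by whitespace, and the name `split_sentences` is hypothetical. Abbreviations such as "e.g." will trip it up.

```python
import re


def split_sentences(chunk: str) -> list[str]:
    """Split a text chunk into sentences.

    Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    """
    # The lookbehind keeps the end punctuation attached to its sentence.
    parts = re.split(r"(?<=[.!?])\s+", chunk.strip())
    return [part for part in parts if part]


if __name__ == "__main__":
    text = "Time has come to evolve. We need to dig deeper! Ready?"
    for sentence in split_sentences(text):
        print(sentence)
```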

The basics of web-scraping plus some practical advice on how to do it right and get what you have been aiming for.
This page will point you to the article content scraping scripts I published. Bookmark it, as any educational article scraping script will be linked here.
Free articles for the taking should be easy to access and retrieve. I made this script especially to show you how to use PHP and cURL to get the content you need from ArticleInsert.com.
GoArticles.com is a huge database of nonsense articles published only for SEO purposes. Worst of all, the site is heavily cloaked, and I will provide an educational script to uncloak it!
