Windows Vista/7/8/10
100 MB Hard Disk Space
Stable Internet Connection

What are people saying?

Student License

Recently I was looking for a URL extractor program because I am building a search engine, and I have tried nearly 20 of them, all with some missing features. But WDE seems to be the best one right now. I am just a student and the normal price is way too high for me. Do you have any student license to offer? I would really appreciate it.

Best Regards, Jeffrey Richardson, Sweden


Web Addresses

We downloaded and ran the trial version of your web link extractor. I compared it to another email extractor program and yours kicked its butt. Yours scanned 9000 files and found over 1500 links, while the other scanned only 1200 files and found only about 400 links. (This was using the exact same search file.)

Thanks, Mark Jeter


Email List Management

A perfect tool for creating, processing, and managing email marketing mailing lists.

Web Data Extractor Version History

v8.3 (Released 27.12.2011):

  • Added a "No emails found" message on the Emails page
  • Added duplicate removal when loading URLs from a file
  • Fixed a few hangs/deadlocks
  • Added a filter combobox on the Emails page

v8.2 (Released 06.07.2011):

  • Updated search engines list
  • Added a "Translate" button for the "Search keywords:" area, which translates the typed keywords into other languages using Google Translate
  • Added an "Export of domains without any emails" button to the "Emails" results tab
  • Stability improvements
  • Various fixes

v8.1 (Released 27.12.2010):

  • Reworked multi-threading engine
  • Improved URL parser
  • Various fixes

v8.0 (Released 01.07.2010):

  • Improved search engine processing
  • Improved searching of non-standard HTML-formatted sites
  • Improved data autodetection (UTF-8, headers, etc.)
  • Added advanced network/connection processing (Human factor option, incorrect server responses, etc.)
  • Added Windows 7 support
  • Various fixes

v7.3 (Released 17.06.2009):

  • Improved site encoding processing
  • Removed the Site/Directory/Groups length limitation
  • Improved the multi-threading engine; it is now more stable
  • Improved the Email filter settings layout
  • Improved the phone and fax scraper; the algorithm is now more accurate
  • Various GUI tweaks
  • Improved crawling of sites with invalid content or incorrect server responses
  • Fixed the "Stay with full url" and "Follow offsite links" options, which previously failed on some sites

v7.2 (Released 10.04.2009):

  • We are continuously improving our product. This version contains improved algorithms for finding emails, phones, and faxes, based on our users' experience and feedback
  • Extended the set of commonly used phone/fax prefixes
  • Added user-defined email scraping prefixes
  • More powerful intelligent spidering; it now supports all user-defined prefixes
  • Improved search engine processing
  • Smart detection of binary data
  • User interface tweaks and changes, including a tray icon and more

v7.1 (Released 15.04.2008):

  • Improved the phone and fax number search algorithm
  • Added a function for creating a new session; resetting the search parameters restores the default settings
  • When keywords are defined for a site or search engine search, a new keywords column appears in the results table; each found link is matched with the keyword that appeared alongside it on the web page
  • Improved the search algorithm for links represented as table elements. Previously, if a phone prefix was in one table cell and the number in another, the phone number could be skipped; this has been fixed
  • Fixed the option that allows following external links
  • Added Windows Vista support
  • Added the ability to edit program settings without clearing search results
  • Updated and expanded the set of search engines: added engines for more countries and removed all defunct ones
  • Revised and optimized the engine
  • Fixed minor bugs

v6.0 (Released 17.04.2007):

  • New design
  • Database support
  • Faster parsing
  • Updated search engines
  • Many other improvements

v5.0 (Released 06.07.2006):

  • New interface
  • New parsing
  • Ability to reopen session with extracted data
  • Updated search engines
  • Many other improvements

v4.3 (Released 20.09.2005):

  • Powerful email filtering functions
  • Updated search engine listing
  • Extract URLs from search results only, plus other improvements

v4.0 (Released 11.12.2004):

  • Search engine support for 30 countries
  • Page body extraction module
  • Keyword generator
  • URL generator
  • Intelligent spidering mode to extract more data without spidering the entire site
  • Option to set a "meta tag size" limit


Web Data Extractor Professional Version History

v3.10 (Released 06.01.2020):

  • Significantly improved the email address parser
  • Updated the user agents list
  • Added "Retry non-extracted URLs" and "Enhanced Human factor" options in Connection for even more effective work with target websites
  • Added "Check each X minutes" and "Renew after it has read Y number of links" options in "Proxy Servers" for more effective work with proxies
  • Many improvements have been made based on our customers' reviews!

v3.9 (Released 30.12.2018):

  • Cleared the search engines list of outdated/broken links, which increases the speed of the software in "Search Engines" mode
  • Significantly improved the email address parser, especially for JS (JavaScript) hidden emails
  • Improved the option to import your own proxy servers from CSV files
  • Improved work with HTTPS websites
  • Improved performance when working with large URL lists
  • Improved the "Cookie Capture" option
  • Various minor fixes/improvements based on customers' feedback

v3.8 (Released 29.12.2017):

  • Added the ability to load and extract information from PDF files
  • Added the ability to load the license file directly from the UI form when the trial period expires. Alternatively, the license file can be loaded from the Options -> About form if the trial period has not yet expired
  • Significantly improved work through proxy servers
  • Improved the parser of encoded JS emails
  • Added a "Re-start URL" context menu item to the "Bad URLs" list
  • Improved work with the software's internal data repository
  • Added the ability to delete sessions along with all their data and service files; the software also automatically compresses its internal repository to reduce the required disk space
  • Added an "Initial Referer" text field in the UI. Some websites display different information depending on which external site the visitor came from; the "Initial Referer" field lets you specify the web address of such a site
  • We also made various minor changes and improvements based on feedback from our customers

v3.7 (Released 28.02.2017):

  • Improved "Search Engines" mode
  • Improved the "Remove HTML Tags" and "Page must contain the following text to extract data" filters
  • Added a "Use country IP filter" that excludes results from servers whose geolocation does not match the country selected in the "Search Engines" option
  • Significantly improved the email parser and the "Custom Builder" parser
  • General improvements in data detection and extraction
  • We also made various minor changes and improvements based on feedback from our customers

v3.6 (Released 22.08.2016):

  • Added a "Get redirected URL" checkbox on the "Custom Data Editor" form to extract URLs (e.g. website addresses) that are reached through a redirect
  • Added a "Mark Non-Responding Proxies Like Inactive Automatically" checkbox. If during a session a proxy server is determined to be bad (not working), it is automatically marked as inactive and is no longer used in the session
  • Added a "Use single line merge" option to merge data into a single string. For example, you can export t-shirt colors like: "T-Shirt", "Black, Yellow, Red, Green"
  • Significantly improved loading of public proxy servers from the Internet
  • Improved the "Human Factor" option
  • Improved the parser for JS-obfuscated email addresses
  • Improved handling of Google captchas when searching for data via Google
  • We also made various minor changes and improvements based on feedback from our customers
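The "single line merge" option described above collapses several extracted values for one item into a single exported field. The sketch below is an illustrative approximation of that behavior, not WDE's actual implementation; the `merge_rows` helper and the record layout are assumptions made for the example.

```python
import csv
import io

def merge_rows(records):
    """Merge (name, value) records into one comma-joined value per name,
    mimicking a "single line merge" on export (illustrative only)."""
    merged = {}
    for name, value in records:
        merged.setdefault(name, []).append(value)
    return {name: ", ".join(values) for name, values in merged.items()}

# Several color values extracted for one product, as in the changelog example.
records = [
    ("T-Shirt", "Black"),
    ("T-Shirt", "Yellow"),
    ("T-Shirt", "Red"),
    ("T-Shirt", "Green"),
]
merged = merge_rows(records)
print(merged["T-Shirt"])  # Black, Yellow, Red, Green

# Write the merged record as one CSV line; the joined field gets quoted.
buf = io.StringIO()
writer = csv.writer(buf)
for name, value in merged.items():
    writer.writerow([name, value])
print(buf.getvalue().strip())  # T-Shirt,"Black, Yellow, Red, Green"
```

The CSV writer quotes the merged field automatically because it contains commas, which is what produces the "T-Shirt", "Black, Yellow, Red, Green" shape mentioned above.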

v3.5 (Released 28.10.2015):

  • Significantly improved the mechanism for searching data through search engines (added a mechanism for working with Google captchas, etc.)
  • Added the ability to capture cookies (new "Capture Cookie" button) and run a session with cookies (very useful when search form parameters are passed via cookies)
  • Added the ability to import proxy servers from a service that publishes fresh proxies every 30 minutes. About 100-140 proxies are imported, and each new import replaces the previously downloaded list. During a session, any server that becomes completely inoperative is automatically marked inactive, so only working servers remain in the list
  • Added a new parser to decrypt email addresses hidden by JavaScript
  • Revised and improved server error handling, which has a positive impact on work through proxy servers
  • Improved the email/fax address parser
  • Various minor improvements

v3.4 (Released 03.09.2015):

  • Improved the parser of JavaScript-protected email addresses; added 2 new decoders
  • Improved the algorithm for merging data for export
  • Added an "Add in results" checkbox to the filter "URL Filter: Page must contain the following text to extract data". When enabled, the results table includes the filter keywords that matched when the data was retrieved
  • Improved the link parser; added handling for unquoted links in page sources
  • Improved handling of large data sets
  • Improved the data export mechanism
  • Improved "URL List" mode filtering
  • Added recognition of servers that cannot serve uncompressed content, and correct request formation for such servers
  • Added a new search engine, IXQUICK. It does not save your IP and searches the main search engines, so you can spider for days without being blocked
  • Fixed an "Object null reference" issue
  • Various minor additions/fixes

v3.3 (Released 05.05.2015):

  • Improved the parser of JavaScript-protected email addresses
  • Improved handling of network errors. Temporarily unavailable pages (for example, due to high server load) are now recognized more reliably
  • Added support for regular expressions in filters. To have a filter recognized as a regular expression, enclose it between the symbols "^" and "$"
  • Added detection of German-specific characters in URLs
  • Added a "Recovery" button in the settings. It lets you export all collected data for a selected date range even if the program's main database has been damaged for some reason
  • Added the ability to export data to the Excel file format
  • Added the ability to save results in multiple files when there are too many results. For example, you can specify 10,000 lines per file (supported range: 1 - 1,000,000) and get a main file "Results.(txt|csv|xlsx)" plus automatically generated files "Results_XXXX.(txt|csv|xlsx)" for each additional 10,000 lines
  • Greatly improved the algorithm for traversing large sites containing millions or tens of millions of links
  • Various fixes/additions based on your feedback
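The regex-filter convention above (a filter enclosed between "^" and "$" is treated as a regular expression) can be illustrated with a small sketch. The `url_matches` helper and the substring fallback for plain filters are assumptions for the example; only the "^…$" marker convention comes from the changelog.

```python
import re

def looks_like_regex(pattern: str) -> bool:
    """The stated convention: a filter is treated as a regular
    expression when it is enclosed between '^' and '$'."""
    return pattern.startswith("^") and pattern.endswith("$")

def url_matches(filter_text: str, url: str) -> bool:
    # Hypothetical filter check: regex match if marked, else plain substring.
    if looks_like_regex(filter_text):
        return re.search(filter_text, url) is not None
    return filter_text in url

# Plain-text filter: simple substring match.
assert url_matches("contact", "https://example.com/contact.html")

# Regex filter: match any page under /products/ ending in .html.
pattern = r"^https?://[^/]+/products/.+\.html$"
assert url_matches(pattern, "https://example.com/products/widget.html")
assert not url_matches(pattern, "https://example.com/about.html")
```

Since "^" and "$" also anchor the regex to the start and end of the URL, such a filter always matches the whole address rather than a fragment of it.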

v3.2 (Released 30.12.2014):

  • Added a "Remove duplicates" option for phones and faxes
  • Changed the principle of extracting links and domains in "Search Engines" mode (when the "URLs" and "Domains" checkboxes are marked in the session form). These lists now include the URLs and domains of websites that contain the sought-for keywords; previously the list contained all found URLs, which made it less useful in "Search Engines" mode
  • Increased the maximum search depth on websites from 10 to 100
  • Improved parsing of emails protected by JavaScript; added an algorithm for decrypting a new kind of email protection
  • Added the ability to search local website copies on disk, for example "c:\inetpub\wwwroot\spadix". This can be set as the "Start URL" in "Site" mode as well as in the links file for "URL List" mode
  • Increased program stability. In case of an abrupt reboot or system crash, all data collected during a session is preserved; auto-saving runs every 30 seconds
  • The program now requires .NET Framework 4.0 (Client Profile or full profile)
  • Upgraded the SQLite library to the latest version
  • Fixed crashes in some cases when building in the "Custom Data Editor"

v3.1 (Released 05.09.2014):

  • Added the ability to edit URL and email filters in a stopped session, and then continue with the edited filters
  • Added the ability to load proxy server lists from text files (*.txt), including files in the "host:port" format
  • Added percentage progress for requests; the request list now updates much faster
  • The "Title" field now shows the name of the proxy through which a request is being sent (for running requests only)
  • Improved proxy dispatching; it now works much more efficiently with large proxy lists
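Loading proxies from a "host:port" text file, as mentioned above, amounts to simple line-by-line parsing. The sketch below is illustrative only and is not WDE's actual loader; the skip-malformed-lines behavior is an assumption.

```python
def parse_proxy_list(text: str):
    """Parse a proxy list in "host:port" format (illustrative sketch).

    Blank lines and lines without a numeric port are skipped.
    """
    proxies = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        # rpartition handles hostnames that themselves contain colons poorly,
        # but works for the common host:port case shown in the changelog.
        host, _, port = line.rpartition(":")
        if host and port.isdigit():
            proxies.append((host, int(port)))
    return proxies

sample = """
10.0.0.1:8080
proxy.example.com:3128
not-a-proxy
"""
print(parse_proxy_list(sample))
# [('10.0.0.1', 8080), ('proxy.example.com', 3128)]
```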

v3.0 (Released 23.06.2014):

  • Added support for working with a list of proxy servers
  • Various small fixes/additions based on your feedback

v2.3 (Released 08.01.2014):

  • Added the ability to retrieve data while preserving custom HTML markup ("Remove HTML tags" checkbox)
  • Improved extraction of meta tags
  • Fixed errors when exporting to CSV
  • Various small fixes/additions based on your feedback

v2.2 (Released 15.05.2013):

  • Bug fixes and improvements based on customer requests
  • Added a "Remove duplicates" option to the "Email Filter"
  • New regular expression builder for custom data search. You can now choose one or two similar text blocks; it works even with only one block chosen. This is useful when you need to extract company details, for example an address, phone numbers, etc., and the web page has information for only one company

v2.1 (Released 27.12.2012):

  • Significantly enhanced the phone and fax parser
  • Added additional filters for phones and faxes. You can now indicate which digits a phone number may contain, as well as the maximum length of a phone/fax number
  • You can now save/load session settings to/from a file
  • Added command line argument support: you can start a session from saved settings and specify an input file
  • Added the ability to perform an "advanced" merge of custom data results, so you can gather structured data from online shops and other sites

v2.0 (Released 29.08.2012):

  • Visual expression builder - it was never that easy to configure your custom extraction expressions
  • New updated help with a lot of use cases and examples
  • Merging results when saving to a file or copying to the clipboard
  • Great results filtering option
  • New feature - "Collect Domains Without Emails"
  • Many visual and engine changes/fixes

v1.2 (Released 07.06.2012):

  • Added the ability to scan RSS feeds
  • Added program resilience to physical database damage
  • Improved thread control, which has a positive impact on overall performance
  • Added the ability to detect obfuscated email addresses such as info[at] and info(at)
  • Added decoding of email addresses hidden with JavaScript
  • Scan time is now displayed without fractions of a second and includes a day count
  • Improved work with large keyword lists in "Search Engines" mode
  • Added quote support in keywords to search for exact phrases and words in Google
  • Reworked the algorithm for determining the scan depth (URL Depth)
  • Improved the filter that screens out potentially incorrect phones and faxes
  • Added the ability to set the "Fixed Number Pages" in "Search Engines" mode
  • Added the ability to detect the tag "<base href .."
  • Updated the SQLite engine
  • Added searches for new countries: Arabia, Argentina, Chile, Philippines, Singapore. Also checked and corrected the existing list of search queries
  • Various fixes
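The info[at]/info(at) detection mentioned above is a common anti-scraping countermeasure: the "@" (and often ".") is spelled out so a naive email regex misses it. A typical countermeasure on the extractor's side is to normalize those spellings before matching. This is a generic illustrative sketch; the exact set of obfuscations WDE recognizes is not documented here.

```python
import re

# Common obfuscated spellings of "@" and "." (an assumed, non-exhaustive set).
AT_PATTERN = re.compile(r"\s*(?:\[at\]|\(at\))\s*", re.IGNORECASE)
DOT_PATTERN = re.compile(r"\s*(?:\[dot\]|\(dot\))\s*", re.IGNORECASE)

def deobfuscate(text: str) -> str:
    """Rewrite obfuscated address fragments into plain form so that a
    normal email regex can then pick them up."""
    text = AT_PATTERN.sub("@", text)
    return DOT_PATTERN.sub(".", text)

assert deobfuscate("info[at]example.com") == "info@example.com"
assert deobfuscate("info (at) example (dot) com") == "info@example.com"
```

After normalization, a standard email pattern applied to the rewritten text finds addresses that the original page deliberately hid.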

v1.1 (Released 04.04.2012):

  • Significantly improved recognition of phones and faxes
  • Updated list of search engines. Added specific Arabic, Chinese and other search engines
  • Added extended copying of results to the clipboard
  • Polished the design: aligned control positions and sizes; buttons now have a new style
  • Various fixes

v1.0 (Released 12.03.2012):

  • Completely new powerful spidering engine
  • Completely reworked UI - slick & sexy
  • Pro version of WDE doesn't have any limits - feel free to process thousands of sites, gigabytes of data
  • Extremely fast search and accuracy
  • Extract any data you want by Custom data extraction
  • Robust .Net engine keeps searching for days without interruptions
  • New session management allows you to manage huge amounts of data
  • Brand new simplified user interface