VOLARO Insight has developed advanced undetected data harvesting and extraction platforms to provide you with:
- the data you need from any open web source
- at the time you need it
- with your required structure
Whether you are a leading portal, a directory, an advertising or PR agency, a homeland security agency or a vendor, VOLARO Insight provides B2B services and products that give your organization an undetected, smooth, focused and timely flow of relevant data. Our platforms support data extraction from thousands of sites in parallel, while the data is filtered, analyzed and structured based on your needs. We constantly track changes on your web sources and update your database accordingly.
VOLARO Insight technology is based on many years of R&D experience with our partners and customers, and includes advanced undetected algorithms and tools for data extraction from any website. Our products are based on advanced distributed crawling and site harvesting. The data can be extracted based on fields, keywords, or any combination of the two. The scheduling, pace and frequency are defined by the customer, as are the mapping and the database schema.
We use data validation and anomaly-detection algorithms, and then filter and pack the data based on the user's requirements.
Our engines provide data extraction, while simulating real user browsing. In order to enhance the user-like browsing, we mimic various users' profiles in parallel from various IP addresses, and create browsing noise.
The crawling engine overcomes anti-bot barriers and sites that use blocking technologies.
Our setup time for new services is fast, and the system adapts itself to structural changes in the websites, such as identifying new fields or loss of existing fields.
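One simple way to picture the detection of new or lost fields is to compare the set of field names seen in the previous crawl with the set seen in the current one. The sketch below is only an illustration under that assumption; the function name and the sample field names are hypothetical and do not describe the actual VOLARO Insight implementation.

```python
def diff_fields(previous, current):
    """Return the fields that appeared and disappeared between two crawls."""
    prev, curr = set(previous), set(current)
    return {"new": sorted(curr - prev), "lost": sorted(prev - curr)}


# Hypothetical field names from two crawls of the same listing page.
changes = diff_fields(["title", "price", "phone"], ["title", "price", "email"])
print(changes)
```

A report like this could then be used to flag a site for review or to update the field mapping.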
Our products are:
The VOLARO extract is a full outsourcing service that enables customers to collect high-quality data
from the web in a simple, fast and cost-effective manner.
The extract crawls the web; captures, processes, traces and collects business-specific data;
analyzes it; and generates alerts and reacts according to predefined settings
executed by the business rules.
The service includes:
Defining the target sites with the customer
Collecting the data at an agreed interval
Mapping the data to a defined schema
Shipping the packed data in any format
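The schema-mapping step in the list above can be sketched as a rename from site-specific field names to the agreed schema. This is a minimal illustration, not the service's actual code: `FIELD_MAP`, `SCHEMA` and the sample record are hypothetical names chosen for the example.

```python
# Hypothetical mapping from site-specific field names to the agreed schema.
FIELD_MAP = {"headline": "title", "cost": "price", "tel": "phone"}
SCHEMA = ("title", "price", "phone")


def map_record(raw):
    """Rename source fields and return a record with exactly the schema's keys."""
    renamed = {FIELD_MAP.get(key, key): value for key, value in raw.items()}
    # Fields missing on the page become None; fields outside the schema are dropped.
    return {field: renamed.get(field) for field in SCHEMA}


record = map_record({"headline": "Used bike", "cost": "120", "color": "red"})
print(record)
```

Keeping the mapping in a plain dictionary per site means a change on the source page only requires editing that site's mapping, not the schema itself.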
The extract collects data from any web source, including news articles, classifieds, directories, blogs,
forums, social networks, talkbacks, video sharing and pictures. The Focalextract high-performance crawling infrastructure can scan up to 100 pages per minute and is scalable, depending on the number of crawlers used. The crawling frequency is defined based on the
customer's preferences, up to twice a day per site.
The extract data capturing module collects data from defined fields on any type of page.
The data collection module enables adding logic allowing data filtering by criteria constructed of
a collection of words and phrases.
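Filtering by a collection of words and phrases can be sketched as follows. The criteria structure (`any_of`, `all_of`, `none_of`) is an assumption made for this example, and matching is a plain case-insensitive substring test rather than whatever logic the module really applies.

```python
def matches(text, any_of=(), all_of=(), none_of=()):
    """Return True if the text satisfies the word and phrase criteria."""
    lowered = text.lower()

    def has(phrase):
        return phrase.lower() in lowered

    return (
        (not any_of or any(has(p) for p in any_of))
        and all(has(p) for p in all_of)
        and not any(has(p) for p in none_of)
    )


# Hypothetical collected items.
items = [
    "Apartment for rent in the city center",
    "Car for sale, low mileage",
    "Office for rent, sold out",
]
kept = [t for t in items if matches(t, any_of=["for rent"], none_of=["sold out"])]
print(kept)
```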
The extract data processing module analyzes the data both on a site-by-site basis and at
a cumulative level, making it possible to identify whether each collected item is New,
Updated, Expired or omitted from the site. Additionally it can exclude duplicates from multiple
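The state tracking described above can be sketched by comparing two snapshots of a site keyed by a record identifier. This is an assumption-laden illustration: the record identifiers, the content fingerprint and the "Unchanged" state are introduced for the example, and the "Expired" state is left out because it would need an expiry date field that the text does not describe.

```python
import hashlib


def fingerprint(record):
    """Stable hash of a record's content, independent of key order."""
    canonical = "|".join(f"{key}={record[key]}" for key in sorted(record))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def classify(previous, current):
    """Compare two snapshots (dicts of record id -> record) and label each id."""
    states = {}
    for rid, rec in current.items():
        if rid not in previous:
            states[rid] = "New"
        elif fingerprint(rec) != fingerprint(previous[rid]):
            states[rid] = "Updated"
        else:
            states[rid] = "Unchanged"
    for rid in previous:
        if rid not in current:
            states[rid] = "Omitted"
    return states


def drop_duplicates(records):
    """Keep the first occurrence of each record with identical content."""
    seen, unique = set(), []
    for rec in records:
        fp = fingerprint(rec)
        if fp not in seen:
            seen.add(fp)
            unique.append(rec)
    return unique


# Hypothetical snapshots from two consecutive crawls.
previous = {"a": {"title": "X", "price": "1"}, "b": {"title": "Y", "price": "2"}}
current = {"a": {"title": "X", "price": "3"}, "c": {"title": "Z", "price": "4"}}
states = classify(previous, current)
print(states)
```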
The extract enrichment module is part of the extended service and is not included in the
basic service. This module executes data transformation and data enrichment. Customers mainly
choose this module when the data is used by internal applications
and they prefer to have it transformed and enriched before they receive it, eliminating the
need to manipulate the data before uploading it onto their systems.
The extract file formatting module generates the files to be delivered and uploaded onto
the customers' systems. Using this module you can specify the format in which you want to receive
the files (XML, Excel, etc.), their internal structure, and their content: single record,
whole site, all sites or any other preference.
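As an illustration of format selection, the sketch below serializes the same records either as XML or as CSV. CSV stands in here for a spreadsheet-friendly format that Excel can open; that substitution, the tag names and the sample record are all assumptions for the example. It also assumes every field name is a valid XML tag name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_xml(records, root_tag="records", item_tag="record"):
    """Serialize records as one XML document, one element per record."""
    root = ET.Element(root_tag)
    for rec in records:
        item = ET.SubElement(root, item_tag)
        for key, value in rec.items():
            ET.SubElement(item, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def to_csv(records):
    """Serialize records as CSV text, using the first record's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical record; the same data is rendered in both formats.
records = [{"title": "Used bike", "price": "120"}]
xml_text = to_xml(records)
csv_text = to_csv(records)
print(xml_text)
print(csv_text)
```

Passing a single record, one site's records or all sites' records to these functions corresponds to the content preferences mentioned above.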
The Delivery module manages file delivery to the customers' systems using common channels
such as email, FTP, or any other means of communication the customer chooses.
Setup & initialization
VOLARO Insight uses a methodology we have developed to ensure a fast, simple and accurate setup procedure.
This is done together with our representative to reduce complexity and simplify the process. Normally the setup stage is finalized within a single meeting.
Intelligent Web crawling system
Robust crawling and data collection modules
Flexible crawling scheduling
Full browser simulation
Dynamic IP support
Simulates client-side cookies
Tracing technology supports JavaScript, Ajax, Flex, Flash, multiple frames, JPEG data, and CAPTCHA at
certain noise levels
Robust, cracks any site, any resistance
Fast site setup and easy adaptations to changes
High performance - x,000 of sites per hour
Deep Web crawling: can crawl and extract data from any level of a web site, also from pages which are
not indexed by search engines.
Experienced with various types and large numbers of web sites
Collect is an advanced Internet OSINT solution that enables security and federal agencies
to define their own data collection from the web. Access to web sources, content
management systems, forums, blogs and any public web site is a major requirement for federal
agencies to effectively perform their mission of tracking their enemies and threats.
VOLARO Insight is a market leader in undetected OSINT data collection and the pioneer of full
in-house, self-operated solutions for data collection. For the first time, VOLARO Insight provides
security and intelligence agencies with Focalcollect, a solution that is self-operated within
their IT environment, including:
New source definition
Online 24/7 data collection
Data packed and delivered to external systems based on
specific structure and relationship as defined by the user
Robust and surgical extraction of web data without predefined coding or scripting
Undetected innocent data collection, mimicking real browser user behavior
Collect provides homeland security and intelligence agencies, governments, military and police,
and financial and civilian agencies with advanced capabilities to extract web data from any source with
the highest level of accuracy. The results are fed into any standard or customized data structure,
application, search tool, or hardcoded reports or analytics systems.
The impact is an advanced platform for online monitoring of the publicly available Internet,
aimed at advertising and marketing managers, political campaign managers, as
well as intelligence and information investigators.
The impact provides a flow of several steps: data collection, data analysis,
alerting and online response tools.
The impact is a B2B solution that allows different levels of use for different management
levels and various functions within the organization, with the ability to transfer data to
the organization's IT systems.
The impact system collects all relevant data, as defined, and provides you with a GUI with a
snapshot of the monitored campaign, or your other subject of interest, including statistics and
sentiment analysis, as well as detailed information for each piece of information collected.
The collected data includes any required information from any type of open-source web site:
news sites, blogs, forums, social networks and others. FocalImpact also monitors users'
web chatter, including talkbacks and social network activities. All collected data is stored and
analyzed, allowing the user to define his look & feel.
The impact is designed to support a variety of user levels, each with its own set of functions that supports the profile's needs, in line with the user's responsibilities in the organization.
Available user profiles:
Executive – define subjects, view data using predefined Dashboard views and drill down into the data.
Power User – all the Executive functionality, plus the right to define new Dashboard views and execute
advanced data analysis queries.
Marketer – all the Power User functionality, plus management of business rules and alert settings, and
creating responses to data published as part of the organization's official and unofficial web activities through
blogs, forums and talkbacks.
Alerting and Responding
The impact allows users to define alerts, based on various rules, as derived from the collected information
and subjects. Alerts are delivered in real time by the FocalImpact alerting system, by email or SMS.
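A rule-based alert of this kind can be sketched as a keyword set plus a mention threshold evaluated over the collected posts. The `AlertRule` structure, its fields and the sample data are hypothetical and only illustrate the idea; delivery by email or SMS is not shown.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    name: str
    keywords: tuple  # phrases to look for (case-insensitive)
    min_mentions: int  # number of matching posts needed to fire the alert


def triggered(rules, posts):
    """Return the names of the rules whose mention threshold is reached."""
    fired = []
    for rule in rules:
        count = sum(
            any(keyword.lower() in post.lower() for keyword in rule.keywords)
            for post in posts
        )
        if count >= rule.min_mentions:
            fired.append(rule.name)
    return fired


# Hypothetical rules and collected posts.
rules = [AlertRule("brand-buzz", ("acme",), 2), AlertRule("recall", ("recall",), 1)]
posts = ["Acme launches new phone", "Loving my ACME phone", "Weather is nice"]
fired = triggered(rules, posts)
print(fired)
```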
When a user gets an alert, he can activate the respond functionality, which allows him to take part in the discussion and stay in touch with his online customers. A response can be created automatically, based on the alert messages, or manually, with the user writing the message himself.
Monitor the web buzz of your brands, markets, competitors and subjects of interest by specifying
a subject, including keywords and users to watch.
View & analyze the buzz using a dashboard with readymade templates.
Receive alerts on business critical events.
Respond to posted data in line with the monitored data and activities.
Access from anywhere with no need to install: available as a Software as a Service (SaaS) solution.
Rich data monitoring and tracking facilities
Real time alerts
Access data both at a statistical level and at the level of a single data item
Monitor both data and users
Web-based SaaS solution – no need to install and maintain, can be used anywhere