Paglo Crawler
Paglo is now available for everyone! And you can set up an account for FREE. Download the open source Paglo Crawler, activate your account, and our systems will instantly set you up (with no human intervention).
Once you complete the download, you can add or remove plugins that extend your Crawler, or utilize our code to build your own Crawler agents. The Paglo Crawler runs on Windows XP, Vista and 2003 Server. See below for Linux support.
Overview
The Paglo Crawler is an open source supersearcher — an agent that probes your network for devices and other IT assets, and discovers everything about them. The Crawler is part of Paglo, the first search engine for IT, a tool that specializes in searching the complex and varied data of IT networks, and in returning intelligent data in both simple text and rich quantitative form. The data that the Crawler finds is visible through a secure Paglo Web account. A single Paglo Crawler can be installed to probe an entire enterprise network.
Paglo finds what no other search engine can find: IT assets.
That includes the characteristics of devices on the network, such as type, configuration, and what other devices are connected. That kind of information about a network isn't collected neatly into documents that you can hunt for. If it were, Google could find it. But you know how it is: as soon as you document something in IT, it changes!
Paglo finds this kind of information by sending the Crawler through the network and communicating with each device. No need for hours of manual data entry to document. And Paglo works continuously. The Paglo Crawler does its crawling over and over again, which keeps the data it collects fresh. It automatically self-updates.
The Paglo Crawler stores the information that it gathers throughout your network in your private Paglo Search Index, in a separate and unassailable location that only you have access to. When you log in to your Paglo account, you get all the data that's yours, and only the data that's yours. And you're the only one who has access to it. Your login is the key — no one else can see your data unless you give them your login password.
We have also made the Crawler extendable so you can develop your own plugins to collect additional IT data or use existing add-ons that are available to the community.
Features of the Paglo Crawler:
- Open Source — Anyone can download the source code from our Subversion repository
- Extendable — We encourage you to develop plugins to extend the type of IT data that the Crawler can capture
- Powerful — Employs unique scanning and probing techniques for discovering devices and other IT assets, and identifying their unique characteristics
- Easy — Installs in minutes via free download, and provides rich and sophisticated data visible in a browser through a secure Web account
Linux Instructions
We provide Debian packages for Ubuntu 8.04LTS and Ubuntu 9.04. Ubuntu 8.04LTS corresponds to the Linux kernel 2.6.24-24. Ubuntu 9.04 corresponds to the Linux kernel 2.6.28-13.
Building
The packages are for i386/AMD-based systems. Alternatively, you can build the crawler from the public Subversion repository at: https://svn.paglo.com/paglo_open_source/crawler/branches/multi_process.
Prerequisites
The following Debian packages must be installed to build the Paglo Crawler successfully:
- libssl-dev
- libpcap-dev
- ruby-dev
- libxml2-dev
- libopenssl-ruby
- libcurl4-openssl-dev
- fakeroot (to build the Paglo Crawler .deb)
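On a Debian or Ubuntu system, the packages listed above could be installed in one step with apt-get. The following is a minimal sketch that only prints the command so you can review it before running it. The variable name PAGLO_BUILD_DEPS is our own and is not part of the Paglo build.

```shell
# Package names are copied from the prerequisite list above.
# PAGLO_BUILD_DEPS is a hypothetical variable name used only in this sketch.
PAGLO_BUILD_DEPS="libssl-dev libpcap-dev ruby-dev libxml2-dev libopenssl-ruby libcurl4-openssl-dev fakeroot"

# Print the install command instead of running it, so nothing changes
# on the system until you copy and run the line yourself.
echo "sudo apt-get install $PAGLO_BUILD_DEPS"
```

Package names can differ between Ubuntu releases, so check the output of apt-get if a package is reported as missing.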
Building SNMP++ 3.x
First we build 'snmp_pp'. Unfortunately it does not have a 'configure' script, so you will need to edit:
multi_process/vendor/snmp_pp/include/snmp_pp/config_snmp_pp.h
You need to uncomment the lines:
#define SNMP_PP_NAMESPACE
and
#define _USE_OPENSSL
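If you prefer to script this edit, something like the following sketch may work. It assumes that each of the two lines is commented out with a leading `//`; we have not confirmed the comment style used in the header, so inspect the file first and adjust the pattern if needed. The function name uncomment_define is our own.

```shell
# uncomment_define FILE MACRO
# Removes a leading "//" from the line "#define MACRO" in FILE.
# ASSUMPTION: the line is commented out with "//"; the real header
# may use a different comment style.
# A backup of the original file is kept as FILE.bak.
uncomment_define() {
    sed -i.bak -E "s|^[[:space:]]*//[[:space:]]*(#define[[:space:]]+$2)([[:space:]].*)?\$|\1\2|" "$1"
}

# Typical use, from the top of the checked out source:
#   hdr=multi_process/vendor/snmp_pp/include/snmp_pp/config_snmp_pp.h
#   uncomment_define "$hdr" SNMP_PP_NAMESPACE
#   uncomment_define "$hdr" _USE_OPENSSL
```

Lines that are already uncommented, and other defines, are left unchanged.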
Then compile it:
cd multi_process/vendor/snmp_pp/src/
make -f Makefile.linux
Since we do not install 'snmp_pp' it is best to have the crawler built with the static version of the library, so we remove the shared version that was also built.
rm multi_process/vendor/snmp_pp/lib/libsnmp++.so
Building Google Breakpad
Next you need to build Google Breakpad. Luckily this is a much simpler process:
cd multi_process/vendor/google-breakpad/
./configure
make
Building the Paglo Crawler
Now you can build the crawler itself. It uses 'curl' and 'libxml2'. The configure script will attempt to find those using their "-config" programs if you have their packages installed. You do need to tell it where to find the Google Breakpad code and the snmp_pp headers and libraries. You need to supply the absolute path on the configure line:
cd multi_process/src/
./configure --with-breakpad-headers=<checked out source>/multi_process/vendor/google-breakpad/src \
--with-snmp-libs=<checked out source>/multi_process/vendor/snmp_pp/lib/ \
--with-snmp-headers=<checked out source>/multi_process/vendor/snmp_pp/include/
make
After this is done you will have a built crawler.
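Because the three configure options need absolute paths, it can help to derive them from one variable. The sketch below builds the option strings from the absolute path of your checkout. The function name paglo_configure_args and the variable SRC_ROOT are hypothetical names used for illustration only.

```shell
# paglo_configure_args SRC_ROOT
# Prints the three configure options, one per line, using SRC_ROOT
# (the absolute path of your checked out source) as the prefix.
paglo_configure_args() {
    src_root=$1
    printf '%s\n' \
        "--with-breakpad-headers=$src_root/multi_process/vendor/google-breakpad/src" \
        "--with-snmp-libs=$src_root/multi_process/vendor/snmp_pp/lib/" \
        "--with-snmp-headers=$src_root/multi_process/vendor/snmp_pp/include/"
}

# Typical use (SRC_ROOT must not contain spaces, because the
# command substitution below is split on whitespace):
#   SRC_ROOT=/absolute/path/to/checkout
#   cd "$SRC_ROOT/multi_process/src/"
#   ./configure $(paglo_configure_args "$SRC_ROOT")
#   make
```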
Installing the Paglo Crawler
There is no 'make install' target yet, so you will need to install it manually (presuming that you do not want to run it out of the directory it was built in).
The following commands make a tarball that you can untar wherever you want the crawler to live:
cd multi_process/src/
cp paglo_logging.conf-unix paglo_logging.conf
mkdir compressed_files
mkdir dumps
mkdir file_submissions
mkdir submissions
tar --exclude .svn --exclude \*~ -c -f /tmp/paglo_crawler.tar crawler_executive crawler_main Plugins paglo_logging.conf compressed_files submissions file_submissions dumps
NOTE: The "make tarball" command will tar up just the crawler binaries and the Plugins directory, which is useful for installing upgrades.
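Before copying the tarball anywhere, you may want to confirm that it contains everything the manual tar command was given. The following sketch lists the archive and reports any expected entry that is missing. The function name check_crawler_tarball is our own. It is meant for the full tarball made by hand, not for the smaller one produced by "make tarball".

```shell
# check_crawler_tarball TARFILE
# Returns 0 if every entry passed to the manual tar command is present
# in TARFILE, and 1 otherwise (printing each missing entry).
check_crawler_tarball() {
    listing=$(tar -t -f "$1") || return 2
    missing=0
    for entry in crawler_executive crawler_main Plugins paglo_logging.conf \
                 compressed_files submissions file_submissions dumps; do
        # An entry appears either as "name" (file) or "name/..." (directory).
        if ! printf '%s\n' "$listing" | grep -E "^$entry(/|\$)" >/dev/null; then
            echo "missing from tarball: $entry"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use:
#   check_crawler_tarball /tmp/paglo_crawler.tar
```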
Configuring the Paglo Crawler
You will need two configuration files:
crawler.conf
credentials.conf
We keep the credentials you use for authentication in a separate file as part of our security model. Make sure that only root has access to the credentials file. This file is never sent up to Paglo.
The credentials.conf file should look something like:
snmp_credential=<name>,<ro snmp community string>
telnet_credential=<name>,<password>,<enable password>
ssh_credential=<name>,<userid>,<password>
data_key=<a long hexadecimal key>
You can have multiple snmp, telnet, and ssh credentials. Just give them different names. The name does not matter to Paglo; it simply gives you a handle for each set of credentials.
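As an illustration of multiple named credentials, the sketch below writes an example file with two SNMP entries plus one telnet and one ssh entry. Every name, community string, user ID and password in it is a made-up placeholder, and the file name credentials.conf.example is our own choice. Replace all values with your own before use.

```shell
# Write an example credentials file into the current directory.
# All values are hypothetical placeholders, not real credentials.
cat > credentials.conf.example <<'EOF'
snmp_credential=core_switches,example-community-1
snmp_credential=edge_routers,example-community-2
telnet_credential=legacy_router,example-password,example-enable-password
ssh_credential=linux_servers,exampleuser,example-password
data_key=<a long hexadecimal key>
EOF
```

The data_key line is left in the same placeholder form as above, since the real key comes from your Paglo account.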
The 'data_key' is the data submission key that gives the crawler access to submit data to your company's index. You get that from going to this page:
https://app.paglo.com/user/edit_company
Note the section on the page that says 'Data Key' followed by a hex string.
The crawler.conf file can start off really simple. You need to declare which interface the crawler is listening on. Almost everything else can be configured through the crawler's web-based UI that is part of Paglo:
iface=eth0
scan_threads=10
promisc=true
Unpacking the Paglo Crawler tarball
To 'install' and start the crawler, first create the directory where you want to install it. We use:
mkdir -p /opt/paglo/paglo_crawler-3.0/
Copy your credentials.conf and crawler.conf files to this directory.
Make sure that only root has read access to your credentials.conf file.
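One way to restrict the file is with standard Unix permissions. The sketch below sets mode 600, so that only the file's owner can read or write it. The function name restrict_to_owner is our own. To make root that owner, change ownership as root first, as shown in the comments.

```shell
# restrict_to_owner FILE
# Removes all group and world permissions, leaving read/write
# access for the file's owner only.
restrict_to_owner() {
    chmod 600 "$1"
}

# Typical use (run as root so that root ends up as the owner):
#   chown root:root /opt/paglo/paglo_crawler-3.0/credentials.conf
#   restrict_to_owner /opt/paglo/paglo_crawler-3.0/credentials.conf
```

You can confirm the result with "ls -l credentials.conf", which should show "-rw-------" and root as the owner.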
Untar the tarball we made earlier in this directory:
cd /opt/paglo/paglo_crawler-3.0/
tar xvf /tmp/paglo_crawler.tar
Running
And away you go:
cd /opt/paglo/paglo_crawler-3.0/
sudo ./crawler_executive
The "crawler_executive" will immediately run as a daemon, which means that the command you just ran will return immediately.
If you want to run the "crawler_executive" in the foreground and not have it become a daemon, add the '--app' command line argument:
sudo ./crawler_executive --app
At this point, if you run "ps auxww | grep crawler_" you will see a list of processes: one "crawler_executive" process and several "crawler_main" processes.
You will not see all of the "crawler_main" processes immediately. The "crawler_executive" will start one every 30 seconds until all the ones it is configured to run have started.
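Since the "crawler_main" processes appear gradually, it can be handy to count them rather than read the full listing. The sketch below counts lines of ps output that mention crawler_main. The pattern '[c]rawler_main' is a common grep idiom that stops the grep process from matching its own command line. The function name count_crawler_main is our own.

```shell
# count_crawler_main
# Reads a process listing on standard input and prints how many
# lines mention crawler_main.
count_crawler_main() {
    grep -c '[c]rawler_main'
}

# Typical use:
#   ps auxww | count_crawler_main
```

Running this again after 30 seconds or so should show the count rising until all configured processes have started.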
If you have any questions or something does not seem right please do not hesitate to send an email to 'support@paglo.com'.

