Mozenda Data Scraping with CodeIgniter

Mozenda is a very powerful data scraping service. If you have ever found yourself writing scripts or manually copying and pasting data from one website to another, then Mozenda is for you. They have a very nice, full-featured REST API, which will be the focus of this article.

Overview

I use Mozenda all the time. When scraping crappy static HTML pages, it is very difficult to find consistent markup, and that's where Mozenda comes in. I create an agent to scrape a collection of data in a specific order. It can select data between HTML tags, select data that's supposed to be in one specific place on the page, or skip over a field if the data you are looking for is not there. It is truly a lifesaving service for anyone who updates large amounts of data on a regular basis. You can't beat the price of $49/month either (just think how much you would have to pay someone to scrape all of that data for you). I have created a CodeIgniter library (also usable as a regular PHP class) to interact with this API in an easy way. Now let's get to it!

Download

Mozenda CodeIgniter API Library ( v1.0 )

Documentation

First off, you need to set your API key and choose the format in which results are returned to you. If you don’t have an API key yet:

1. Login to your account at http://Login.Mozenda.com

2. Click the ‘Account’ link located in the top right corner of the web page.

3. Look for the ‘API Web Service Key’ section under the ‘Account Details’ tab, then click ‘Generate a New Key’. You will be required to provide this key in all requests to the API.

If you already have your API key, then all you need to worry about is setting your output format. The two options are array (a PHP array) and json (a JSON string).
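To make the difference between the two formats concrete, here is a minimal sketch. The payload is made up for illustration and is not actual Mozenda output, so the key names (`Result`, `CollectionList`, and so on) are hypothetical. It only shows that a JSON string has to be decoded before you can read it, while an array can be used right away.

```php
<?php
// Hypothetical payload: the real keys depend on the operation and on Mozenda's response.
$json_result = '{"Result":"Success","CollectionList":[{"CollectionID":1001,"Name":"Products"}]}';

// With 'json' you get a string and decode it yourself when you need to read it.
$decoded = json_decode($json_result, true); // true => associative arrays

// With 'array' you would presumably get a PHP array directly, along these lines.
$array_result = array(
    'Result' => 'Success',
    'CollectionList' => array(
        array('CollectionID' => 1001, 'Name' => 'Products'),
    ),
);

// Both forms are read the same way once the JSON has been decoded.
echo $decoded['CollectionList'][0]['Name'], "\n";
echo $array_result['CollectionList'][0]['Name'], "\n";
```

The json format is handy if you just want to pass the result straight through to JavaScript; the array format saves you the decode step when you are working in PHP.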

Configuring Your Library


$config_array = array('output_format' => 'json', 'api_key' => 'MY-SUPER-SECRET-KEY');
$this->mozenda_api->config($config_array);

Collection.GetList

Returns a list of collections for an account.


$data = $this->mozenda_api->collection_get_list();

Collection.GetViews

Gets a list of views for a particular collection.


$data = $this->mozenda_api->collection_get_views($collection_id);

Collection.GetFields

Returns a list of fields that are in that collection, with their details.


$data = $this->mozenda_api->collection_get_fields($collection_id);

Collection.AddItem

Adds an item to a collection with the values specified.


$items = array('Username' => 'John', 'Phone_Number' => '555-0123');

$data = $this->mozenda_api->collection_add_item($collection_id, $items);
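For the curious, here is a rough sketch of how an `$items` array like the one above could be turned into REST query parameters. This is not the library's actual code. The function name `build_add_item_query` is made up, and the parameter names (`WebServiceKey`, `Service`, `Operation`, `CollectionID`, and the `Field.` prefix) are my assumption about the Mozenda REST API, so check them against the official documentation before relying on them.

```php
<?php
// Sketch only: parameter names are assumptions, not taken from the library source.
function build_add_item_query($api_key, $collection_id, array $items)
{
    $params = array(
        'WebServiceKey' => $api_key,
        'Service'       => 'Mozenda10',
        'Operation'     => 'Collection.AddItem',
        'CollectionID'  => $collection_id,
    );

    // Each field name is prefixed so it cannot clash with the fixed parameters.
    foreach ($items as $field => $value) {
        $params['Field.' . $field] = $value;
    }

    // http_build_query() URL-encodes keys and values for us.
    return http_build_query($params, '', '&');
}

$items = array('Username' => 'John', 'Phone_Number' => '555-0123');
$query = build_add_item_query('MY-SUPER-SECRET-KEY', 1001, $items);
echo $query, "\n";
```

The nice thing about building the query from an array is that field values with spaces or special characters get encoded for you.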

Collection.UpdateItem

Updates an item in the collection.


$items = array('Username' => 'Peter', 'Phone_Number' => '555-9876');

$data = $this->mozenda_api->collection_update_item($collection_id, $item_id, $items);

Collection.DeleteItem

Deletes an item from a collection.


$data = $this->mozenda_api->collection_delete_item($collection_id, $item_id);

Collection.Clear

Clears the contents of a collection but leaves the collection intact.


$data = $this->mozenda_api->collection_clear($collection_id);

Collection.Delete

Deletes the collection and all data within it.


$data = $this->mozenda_api->collection_delete($collection_id);

View.GetItems

Returns items from a view.


$data = $this->mozenda_api->view_get_items($view_id);

Agent.GetList

Returns a list of your agents with their ID, Name, Settings, Description, and other important information.


$data = $this->mozenda_api->agent_get_list();

Agent.GetJobs

Returns a list of your agent’s jobs with detailed information.


$data = $this->mozenda_api->agent_get_jobs($agent_id);

Agent.Run

Starts or resumes the Agent.


$data = $this->mozenda_api->agent_run($agent_id);

Agent.Delete

Deletes an agent and all associated schedules for that agent.


$data = $this->mozenda_api->agent_delete($agent_id);

Job.Get

Gets the details of a job by the Job ID.


$data = $this->mozenda_api->job_get($job_id);

Job.Cancel

Cancels a job in the system. Note that a job must be in a Paused or Error state before it can be canceled.


$data = $this->mozenda_api->job_cancel($job_id);

Job.Pause

Issues the ‘Pause’ command for a job currently running in the system.


$data = $this->mozenda_api->job_pause($job_id);

Job.Resume

Resumes a job that is in a Paused or Error state.


$data = $this->mozenda_api->job_resume($job_id);
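Since both Job.Cancel and Job.Resume expect the job to be in a Paused or Error state, it can help to check the state before calling them. Below is a small sketch of such a guard. The helper name `job_can_cancel_or_resume` is made up, and how the state is actually reported in the Job.Get result is an assumption, so adjust it to whatever your result looks like.

```php
<?php
// Hypothetical helper: the state names come from the descriptions above,
// but how a job's state appears in the Job.Get result is an assumption.
function job_can_cancel_or_resume($state)
{
    return in_array(strtolower($state), array('paused', 'error'), true);
}

$state = 'Paused'; // in practice, read this from the Job.Get result

if (job_can_cancel_or_resume($state)) {
    echo "OK to call job_cancel() or job_resume()\n";
} else {
    echo "Pause the job first with job_pause()\n";
}
```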

I really recommend taking a look at the official documentation on the Mozenda website.

That’s It!

That pretty much wraps it all up. If you have any questions or find a bug, please use the comments and I’ll do my best to fix it.

NOTE: The descriptions of what the commands do were copied from the Mozenda documentation to make it easier for the end user to understand. I do not take any credit for writing the descriptions.

5 Responses

04.26.09

Great writeup! Thanks for using us!

04.26.09

Great product! We would be lost without it.

04.26.09

What I think is great is how much raw information Mozenda can scrape for you for such a cheap price. It’s a win-win deal.

Yes. Interesting stuff Tom. I use Mozenda quite frequently to aggregate and analyze data. It’s always interesting to see how others are using it.
