Using ODK Briefcase to fetch data from Aggregate Server

This tutorial was written during the implementation of a mobile-based data collection and management project in Charsadda, Pakistan.

ODK Briefcase is an integral part of a mobile-based data collection and management system. We have been using this simple but very useful tool to push and pull collected data to and from our Aggregate servers (web-based servers that hold the data sent from mobile devices). ODK Briefcase can also fetch data directly from ODK Collect when you are offline.

ODK Briefcase can be used for:

1. Export forms with submissions from ODK Aggregate – PULL

2. Make bulk submissions to ODK Aggregate – PUSH

3. CSV exports of submission data

4. Pull forms and submissions collected by ODK Collect from a mobile phone

Use ODK Briefcase to fetch data from server:

1. Install Java 6 or higher on your computer.

2. Download ODK Briefcase from http://code.google.com/p/opendatakit/ and open it.

3. Specify the location of the ODK storage area on your computer. This creates the ODK Briefcase Storage folder, which will hold all the blank forms and finalized forms. For example, we set it to H:\CharsaddaData, and Briefcase automatically created a subfolder named ‘ODK Briefcase Storage’ there.

4. On the “Pull” tab, click the drop-down to the right of “Pull data from:” (top left) and choose “Aggregate 1.0”.

5. Click “Connect” to set up the connection to Aggregate.

6. Specify the server URL and enter the username and password for the server account. Click “Connect” to retrieve the list of forms from the server. In this case the server address is https://citypulsedata.appspot.com

7. On the main screen, find the form you want to download and tick the check box to its left. For the Charsadda project, select ElectricityConsumer_RCA_F.

8. Click the “Pull” button at the bottom right of the page. Briefcase starts fetching the form and shows the progress. ODK Briefcase automatically checks whether the form has already been downloaded to the given folder. If so, it quickly goes through the existing records and fetches only the new ones.

9. Wait until you see the “SUCCESS!” message under “Pull status”.

Using ODK Briefcase to export data in CSV format:

1. Go to the “Export” tab. Open the drop-down to the right of “Form:” (top left) and choose the relevant form. In our example it is ElectricityConsumer_RCA_F.

2. Click the “Choose” button to select the export directory. In our case it is the same folder, H:\CharsaddaData.

3. Click the “Export” button at the bottom right of the application. The export runs quickly and ends with the same “SUCCESS!” message.

4. Go to the directory where you placed the data (e.g. H:\CharsaddaData). You will find a folder named media along with a csv file named after the form (e.g. ElectricityConsumer_RCA_F.csv). Use this csv file together with the media folder for any further operations.
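
For further processing, the exported csv file can be read with any scripting language. Below is a minimal Python sketch, not part of Briefcase itself. The column names (consumer_name, meter_photo) and the sample rows are hypothetical stand-ins, since the real columns depend on how your form is designed. In practice you would open ElectricityConsumer_RCA_F.csv from the export directory instead of the in-memory sample.

```python
import csv
import io


def read_submissions(handle):
    """Read an exported CSV into a list of dicts, one per submission."""
    return list(csv.DictReader(handle))


def media_files(rows, column):
    """Collect the non-empty media file names stored in one column."""
    return [row[column] for row in rows if row.get(column)]


# Hypothetical sample standing in for the exported csv file.
# Real usage (path is an assumption based on the example above):
#   with open(r"H:\CharsaddaData\ElectricityConsumer_RCA_F.csv",
#             newline="", encoding="utf-8") as handle:
#       rows = read_submissions(handle)
sample = io.StringIO(
    "SubmissionDate,consumer_name,meter_photo\n"
    "2014-04-01,Ali,1396341000.jpg\n"
    "2014-04-02,Sara,\n"
)

rows = read_submissions(sample)
photos = media_files(rows, "meter_photo")
print(len(rows), "submissions,", len(photos), "with a photo")
```

The file names collected this way can then be matched against the files in the media folder that sits next to the csv file.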

Mobile Data Collection with Preloaded data

Background:

Recently I was asked to design a mobile-based data collection and management system for collecting data from sugarcane growers in Pakistan using Android-based devices. The collected data had to include text, numbers and pictures, along with geographic shapes (polygons) of the sugarcane fields.

The field enumerators were expected to have a low level of education, so the mobile data collection had to be as intuitive and user-friendly as possible. The weather and field conditions were harsh and also had to be taken into account.

The collected data was to be audited by supervisors through a web interface, where they wanted to view and verify the collected data as well as the location, shape and size of each sugarcane field. They wanted to view the polygon of the sugarcane field overlaid on a satellite image, with the ability to modify the coordinates of the polygon.

Additionally, some mechanism was required to make the existing data on sugarcane growers available on the mobile data collection devices, so that enumerators would not need to fill in every data field. Instead, they could simply verify whether the existing data was correct.

Form Design with Preloaded data:

We decided to base our work on Open Data Kit, with customized data collection forms and a reporting server. ODK Collect 1.4.3 supports preloading data for a new survey round. We took advantage of that and created a survey form with an associated database of existing grower information. Some key technical aspects of designing such forms are the following:

  1. Create a .csv file containing the data you want to preload into your questions. For example, our csv file is named SCGDV1.csv
  2. The .csv file must contain a column whose name ends with “_key”. This column will be used for lookup. In our case we used “grower_id_key”
  3. The names of the other columns should also be short and unique.
  4. Create a simple form using ODK Build or any other XML form builder of your choice
  5. Open the XML form in Notepad or any other XML editor for advanced changes
  6. Search for “<bind nodeset” and you will reach the part of the form containing the data nodes
  7. Initially they will look like:
    <bind nodeset="/data/grower_id" type="int"/>

    <bind nodeset="/data/name" type="string" required="true()"/>

    <bind nodeset="/data/father_name" type="string"/>

    <bind nodeset="/data/nic" type="int" required="true()"/>

    <bind nodeset="/data/land" type="int"/>

    <bind nodeset="/data/location" type="geopoint"/>

  8. Add pulldata() function to desired nodesets where you want to have preloaded data.
  9. The syntax will be like
    calculate="pulldata('SCGDV1', 'name', 'grower_id_key',  /data/grower_id)"

  10. Here SCGDV1 is the name of the csv file (without the extension), “name” is the heading of the column whose value you want to pull, “grower_id_key” is the column that is searched, and /data/grower_id is the question holding the value to search for.
  11. Suppose you entered 15 as the grower_id in a question and you use the pulldata() function to fetch the name of the grower with id 15 from the csv file. It will search for 15 in the “grower_id_key” column, find the corresponding name in that record and fill the Name question with what it found.
  12. The pulldata() function is used inside a calculate attribute on each nodeset, and the resulting code looks like this:
    <bind nodeset="/data/grower_id" type="int"/>

    <bind calculate="pulldata('SCGDV1', 'name', 'grower_id_key',  /data/grower_id)" nodeset="/data/name" type="string" required="true()"/>

    <bind calculate="pulldata('SCGDV1', 'father_name', 'grower_id_key',  /data/grower_id)" nodeset="/data/father_name" type="string"/>

    <bind calculate="number(pulldata('SCGDV1', 'nic', 'grower_id_key',  /data/grower_id))" nodeset="/data/nic" type="int" required="true()"/>

    <bind calculate="number(pulldata('SCGDV1', 'land', 'grower_id_key',  /data/grower_id))" nodeset="/data/land" type="int"/>

    <bind nodeset="/data/location" type="geopoint"/>

  13. Even when they are numbers, data fields pulled from a .csv file are treated as text strings. Thus, you may sometimes need to use the int() or number() function to convert a preloaded field to numeric form. In my case int() did not work, but number() works fine, as can be seen above. I had to use this function for every nodeset whose data type was integer; otherwise the form gives an error.
  14. Once the form is complete, test it using ODK Validate and upload it to your Aggregate server along with the csv file. Deploy it on your mobile device and it works perfectly.
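
To make the lookup behaviour described in the steps above easier to picture, here is a rough Python imitation of what pulldata() does. It is only a sketch of the idea, not ODK's actual implementation. The helper names (load_lookup, pulldata) and the sample rows are invented for illustration, and returning an empty string for an unmatched id is my assumption. Note how every value comes back as text, which is why the form needs number() for the integer fields.

```python
import csv
import io


def load_lookup(handle, key_column):
    """Index the rows of a preload CSV by the value in its key column."""
    reader = csv.DictReader(handle)
    if not key_column.endswith("_key") or key_column not in (reader.fieldnames or []):
        raise ValueError("expected a lookup column ending in '_key': " + key_column)
    # Keys stay as text, because everything read from a CSV is a string.
    return {row[key_column]: row for row in reader}


def pulldata(table, column, key_value):
    """Return the value of `column` for the row whose key matches key_value."""
    row = table.get(str(key_value))
    # Assumption: an unmatched key yields an empty string.
    return row[column] if row is not None else ""


# Hypothetical sample rows standing in for SCGDV1.csv.
sample = io.StringIO(
    "grower_id_key,name,father_name,nic,land\n"
    "15,Aslam,Akram,1234567,12\n"
    "16,Bashir,Nazir,7654321,8\n"
)
growers = load_lookup(sample, "grower_id_key")

name = pulldata(growers, "name", 15)
land_text = pulldata(growers, "land", 15)
land_number = float(land_text)  # plays the role of number() in the form

print(name, repr(land_text), land_number)
```

Entering 15 as the grower id finds the matching row in the grower_id_key column and returns the requested column from it, just as step 11 describes.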

Mobile Data Collection in Pakistan

Almost every project in rural development, disaster management and community awareness calls for field surveys to collect primary data. In a low-income country like Pakistan, where capacity and administrative problems with data collection are common, surveys are often the only way to collect reliable data. Paper-based data collection has been the standard method for decades, but errors are frequent, storage costs are prohibitive, and the costs of double data entry are high. Recent advances in communication technology have introduced electronic methods of data collection that merge data collection and data entry into one step. Handheld devices such as personal digital assistants and smartphones are increasingly being used instead of paper-and-pencil methods.

In 2008 Pakistan was the world’s third fastest growing telecommunications market. Pakistan’s telecom infrastructure is improving dramatically with foreign and domestic investment in fixed-line and mobile networks; fiber systems are being constructed throughout the country to aid network growth. Approximately 90 percent of Pakistanis live within areas that have cell phone coverage, and more than half of all Pakistanis have access to a cell phone. With 118 million mobile subscribers in March 2012, Pakistan has the highest mobile penetration rate in the South Asian region (Wikipedia 2012). This gives us a strong opportunity to use mobile-based data collection in our regular data collection and research activities to reduce cost and improve accuracy and efficiency.

The concept of electronic data collection has been applied successfully in many developing countries (see map) in the fields of health, agriculture, socio-economic studies, livelihoods & economic development, microfinance, market analysis and customer satisfaction studies. Recently this data collection mechanism has been adopted in Pakistan by some national and international organizations to collect data from remote areas at a reasonably large scale.

Open Data Kit (ODK) is a suite of tools that allows data collection using Android mobile devices and data submission to an online server, even without an Internet connection or mobile carrier service at the time of data collection. One may streamline the data collection process with ODK Collect by replacing traditional paper forms with electronic forms that allow text, numeric data, GPS, photo, video, barcodes, and audio uploads to an online server. You can host your data online using Google’s powerful hosting platform, AppEngine, manage your data using ODK Aggregate and visualize your data as a map using Google Fusion Tables and Google Earth.

Created by developers at the University of Washington’s Computer Science and Engineering department and members of Change, Open Data Kit is an open-source project available to all. It consists of three main components, Build, Collect and Aggregate, as shown below:

[Figure: the three ODK components: Build, Collect and Aggregate]

To my knowledge, mobile data collection using Android-based smartphones has been used, at least in part, in the following projects in Pakistan (as of May 2014):

  1. Multi-sector Initial Rapid Assessment for Pakistan (MIRA) implemented by OCHA and NDMA
  2. Collection of primary data about ‘elements at risk’ in flood plain areas of Indus River implemented by City Pulse (Pvt.) Ltd. (Mar 2012)
  3. Real time data analysis of Participants’ Feedback in training sessions (Jan 2014)
  4. Labour Force Survey in Gilgit Baltistan implemented by AKFP and AKRSP
  5. A pilot project on monitoring of health facilities using smart phones implemented by LUMS
  6. The PakistanGIS team has been training a few groups of university researchers in mobile data collection systems and smartphone-based primary data collection to improve the efficiency and accuracy of data collection for their research. (Aug 2012)
  7. IRG has used mobile data collection for the Electricity Consumers’ Census in KPK for PESCO. The mobile data collection and management solution was provided by City Pulse (Pvt.) Ltd. (April 2014)

Special thanks to Mr. Qadeer for the write-up.