Mobile Data Collection with Preloaded data

Background:

Recently I was asked to design a mobile-based data collection and management system for recording data on sugarcane growers in Pakistan, using Android devices. The collected data had to include text, numeric values and pictures, along with the geographic shapes (polygons) of the sugarcane fields.

The field enumerators were expected to have a low level of education, so the mobile data collection had to be as intuitive and user-friendly as possible. The weather and field conditions were also tough and had to be taken into consideration.

The collected data was to be audited by supervisors through a web interface, where they wanted to view and verify the collected data as well as the location, shape and size of each sugarcane field. They also wanted to view the polygon of each sugarcane field overlaid on a satellite image, with the ability to modify the coordinates of the polygon.

Additionally, some mechanism was required to make the existing data on sugarcane growers available on the mobile data collection devices, so that enumerators would not need to fill in every data field. Instead, they could simply verify whether the existing data was correct.

Form Design with Preloaded data:

We decided to base our work on Open Data Kit (ODK), with customized data collection forms and a reporting server. ODK Collect 1.4.3 allows data to be preloaded for a new round of a survey. We took advantage of that and created a survey form with an associated database of existing grower information. Some key technical aspects of designing such forms are the following:

  1. Create a .csv file containing the data you want to preload into your questions. For example, our csv file is named SCGDV1.csv.
  2. The .csv file must contain a column whose name ends with “_key”. This column will be used for the lookup. For example, in our case we used “grower_id_key”.
  3. The names of the other columns should also be short and unique.
  4. Create a simple form using ODK Build or any other XML form builder of your choice.
  5. Open the XML form in Notepad or any other XML editor for advanced changes.
  6. Search for “<bind nodeset” and you will reach the part of the form containing the data nodes.
  7. Initially they will look like:
    <bind nodeset="/data/grower_id" type="int"/>

    <bind nodeset="/data/name" type="string" required="true()"/>

    <bind nodeset="/data/father_name" type="string"/>

    <bind nodeset="/data/nic" type="int" required="true()"/>

    <bind nodeset="/data/land" type="int"/>

    <bind nodeset="/data/location" type="geopoint"/>

  8. Add the pulldata() function to the nodesets where you want preloaded data.
  9. The syntax looks like this:
    calculate="pulldata('SCGDV1', 'name', 'grower_id_key',  /data/grower_id)"

  10. Here SCGDV1 is the name of the csv file (without the .csv extension), “name” is the heading of the column whose value you want to pull, “grower_id_key” is the column that will be searched, and /data/grower_id is the form field holding the grower ID to search for.
  11. Suppose you enter 15 as the grower_id in a question and use the pulldata() function to fetch the name of the grower with ID 15 from the csv file. It will search for 15 in the “grower_id_key” column, find the corresponding name in that record, and fill the Name question with what it found.
  12. The pulldata() function is used inside a calculate attribute on each nodeset, and the resulting code looks like this:
    <bind nodeset="/data/grower_id" type="int"/>

    <bind calculate="pulldata('SCGDV1', 'name', 'grower_id_key',  /data/grower_id)" nodeset="/data/name" type="string" required="true()"/>

    <bind calculate="pulldata('SCGDV1', 'father_name', 'grower_id_key',  /data/grower_id)" nodeset="/data/father_name" type="string"/>

    <bind calculate="number(pulldata('SCGDV1', 'nic', 'grower_id_key',  /data/grower_id))" nodeset="/data/nic" type="int" required="true()"/>

    <bind calculate="number(pulldata('SCGDV1', 'land', 'grower_id_key',  /data/grower_id))" nodeset="/data/land" type="int"/>

    <bind nodeset="/data/location" type="geopoint"/>

  13. Even when they are numbers, data fields pulled from a .csv file are treated as text strings. Thus, you may sometimes need to use the int() or number() function to convert a preloaded field into numeric form. In my case int() did not work, but number() works fine, as can be seen above. I had to use this function for every nodeset whose data type was integer; otherwise the form gives an error.
  14. Once the form is complete, test it using ODK Validate and upload it to your ODK Aggregate server along with the csv file. Deploy it on your mobile device and it works perfectly.
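
The lookup described in steps 9 to 13 can be sketched outside ODK in a few lines of Python. This is only an illustration of the idea, not ODK's implementation: the sample rows, the `load_rows` helper and the Python `pulldata` function are hypothetical stand-ins, and returning an empty string for a missing ID is an assumption of the sketch.

```python
import csv
import io

# Hypothetical sample standing in for SCGDV1.csv; all values are made up.
SAMPLE_CSV = """grower_id_key,name,father_name,nic,land
15,Grower A,Father A,1234567890123,12
16,Grower B,Father B,9876543210987,7
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts, one per grower record."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def pulldata(rows, column, key_column, key_value):
    """Return `column` from the first row whose `key_column` equals `key_value`.

    Values come back as text, just as the article notes for csv fields.
    An empty string is returned when nothing matches (assumption of this sketch).
    """
    if not key_column.endswith("_key"):
        raise ValueError("the lookup column name must end with '_key'")
    wanted = str(key_value)
    for row in rows:
        if row[key_column] == wanted:
            return row[column]
    return ""


rows = load_rows(SAMPLE_CSV)

# Equivalent in spirit to pulldata('SCGDV1', 'name', 'grower_id_key', /data/grower_id)
name = pulldata(rows, "name", "grower_id_key", 15)

# The pulled value is text, so it has to be converted explicitly;
# float() plays the role that number() plays in the form.
land_text = pulldata(rows, "land", "grower_id_key", 15)
land = float(land_text)

print(name, land_text, land)
```

The explicit conversion on the `land` value is the same reason the integer nodesets in the form wrap pulldata() in number().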