Dissertation Journal: Defended, Edited, Submitted, Accepted

It’s been about a year and a half since my last post about my dissertation.  Two weeks ago, I defended my dissertation, NON-RESPONSE BIAS ON WEB-BASED SURVEYS AS INFLUENCED BY THE DIGITAL DIVIDE AND PARTICIPATION GAP.  I’ve included the abstract below if you’re interested in its content, but I’ll focus here on some of the process.


Inserting Unique Survey IDs into Multipage Paper Surveys

I still believe in paper surveys.  I believe that their immediacy and accessibility make them very well-suited for some situations.  Although I value technology-based surveys (e.g., Web-based, tablet-based), I definitely believe that there are times when paper surveys are superior.

You can imagine that I was very happy when my new employer approved the purchase of (a) a printer with an automatic duplex scanner and (b) an installation of Remark Office OMR 8.  These two tools together will allow us to conduct paper surveys with some level of ease, automation, and accuracy.  I’m particularly happy that this will allow us to break free from the tyranny of Scantron by allowing us to create customized survey instruments that don’t rely on generic Scantron answer forms.

Now that I am learning how to use Remark Office OMR 8, I am figuring out all of those little things that I was previously able to count on other people to do, often without even knowing that they were being done.  Most recently, I had to figure out how to add unique survey IDs on a multipage survey.  Let me break it down for you:

I have a survey that is six pages long.  On each page, I have the page number and I can tell Remark Office where that page number is so I don’t have to worry about keeping pages in order.  But I also need some way to link all of those pages together when I am scanning multiple surveys so the correct six pages are grouped together in the resulting data file.  Hence I need to add a unique survey ID to each page of each survey.  Adding page numbers is easy but how do I add survey IDs?

I had to do this for my dissertation instrument, but that was a one-page instrument so it was a simpler process.  The multipage process took me a few hours to figure out, and here is what I have settled on for now:

  1. Create the survey instrument.  I did this in Microsoft Publisher because it was the desktop publishing tool I had at hand.  I suppose you could use Word or something similar but it won’t give you nearly as much control over the layout.
  2. Print or save the survey as a pdf.
  3. Use that pdf to create another pdf with multiple copies of the survey instrument.  Right now, this is the clunkiest part of this process as I haven’t yet figured out how to directly print multiple copies of the instrument as a pdf.  Instead, I have to save multiple copies and merge them together.  It’s not entirely horrible because each merge of a file with itself doubles the number of copies, so it quickly becomes easy to make a single pdf file with many, many copies of the survey instrument.
  4. Create a simple Excel spreadsheet with the sequence of survey IDs.  My survey instrument has six pages so I end up with one column of numbers where each number is repeated six times before being incremented to the next one.  This spreadsheet is used in a mail merge so I suppose this could easily be done as a comma-separated file or in some other program that produces similar output.  It’s important that the number of survey IDs match the number of surveys in your pdf, so that the spreadsheet has exactly one row for every page.
  5. Create a simple Word document whose only text is a merge field that will insert the survey IDs into the document.
  6. Merge the Word document and save or print the resulting file as another pdf.  You now have two pdf files with the same number of pages; one has survey instruments and the other has survey IDs.
  7. Use pdftk to add the survey ID pdf as a background to the survey instrument pdf.  pdftk is a simple command line tool that lets you manipulate pdfs.  It’s freely available for many platforms, including Windows.  I used the “multibackground” parameter to essentially merge these two pdfs into one, adding the survey IDs to the survey instruments.  I got lucky in that my survey IDs were well-aligned with my survey instrument but you might have to modify one or both of your documents to get the survey ID to end up where you want it.
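Steps 3, 4, and 7 can also be scripted.  What follows is only a minimal Python sketch under stated assumptions, not the process I actually used: the function names, the file names, and the “SurveyID” column header are all made up for illustration.  The “multibackground” operation is the one described in step 7.  Repeating an input handle in pdftk’s “cat” operation is one possible way to get many copies of the instrument into a single pdf without merging by hand; treat it as a suggestion to test on your own files.  The script only builds the command lines and the ID list; it does not run pdftk.

```python
import csv
import io


def survey_id_rows(num_surveys, pages_per_survey, first_id=1):
    """Return one survey ID per printed page, each ID repeated once per page."""
    return [
        survey_id
        for survey_id in range(first_id, first_id + num_surveys)
        for _ in range(pages_per_survey)
    ]


def id_csv_text(ids, header="SurveyID"):
    """Build the mail-merge data source as comma-separated text.

    The column name "SurveyID" is hypothetical; use whatever your merge
    field is called.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([header])
    writer.writerows([survey_id] for survey_id in ids)
    return buffer.getvalue()


def pdftk_copies_command(instrument_pdf, num_surveys, output_pdf):
    """Command that concatenates num_surveys copies of the instrument (step 3)."""
    copies = ["A"] * num_surveys  # the same input handle, repeated
    return ["pdftk", f"A={instrument_pdf}", "cat", *copies, "output", output_pdf]


def pdftk_multibackground_command(surveys_pdf, ids_pdf, output_pdf):
    """Command that places each page of ids_pdf behind the matching survey page (step 7)."""
    return ["pdftk", surveys_pdf, "multibackground", ids_pdf, "output", output_pdf]


if __name__ == "__main__":
    # Three six-page surveys: 18 rows of IDs, 18 pages in each pdf.
    ids = survey_id_rows(num_surveys=3, pages_per_survey=6)
    print(id_csv_text(ids))
    print(" ".join(pdftk_copies_command("instrument.pdf", 3, "surveys.pdf")))
    print(" ".join(pdftk_multibackground_command(
        "surveys.pdf", "survey_ids.pdf", "surveys_with_ids.pdf")))
```

The printed commands can be pasted into a command prompt, or passed to subprocess.run if you would rather have Python run pdftk for you.  The mail merge in steps 5 and 6 still happens in Word, using the generated ID list as its data source.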

Now that I have unique survey IDs for each survey and page numbers on each page, I can feed the surveys into the scanner in any order I want and everything will work!  I just have to ensure that they’re all right-side up because I don’t know how well Remark Office OMR 8 can detect and correct for upside-down instruments.  It’s a feature of the software but I’ll have to test it.  If this were a real concern I’d be looking into possible solutions such as cutting off or rounding one of the corners, but I’ll be working with small enough batches that it will be easier just to flip through the completed instruments.

Item Non-response and Survey Abandonment SPSS Syntax

I don’t often write about what I do in my day-to-day job.  But I’ve recently spent quite a bit of time working on survey item non-response and survey abandonment and I want to save you some time if you’re working on those issues, too.

One of the projects on which I’ve worked over the last couple of years is the development of an updated version of the National Survey of Student Engagement (NSSE) survey instrument. We’ve done a lot – a LOT – of work on this.  As part of this work we’ve pilot tested the draft versions of the new survey.  Some of the many things we’ve analyzed in the pilot data are item non-response and survey abandonment.  I worked on this last year with the first pilot and when I worked on this again with this year’s pilot I got smarter.  Specifically, I wrote an Excel macro that generates the SPSS syntax necessary to analyze item non-response and survey abandonment.

As described in the Excel file, this macro takes a list of survey variable names and creates SPSS syntax that will add several new variables to your SPSS file:

  • An “Abandoned” variable indicating the last question the respondent answered if he or she abandoned the survey. If the respondent didn’t abandon the survey, this variable will be left empty (“SYSMIS”).
  • For every variable, a “SkippedItem__” variable indicating if the survey item was answered, skipped, or left blank because the survey was abandoned.
  • A “SkippedItems” variable indicating the total number of questions the respondent skipped.
  • A “SkippedPercentage” variable indicating the percentage of questions the respondent skipped.
  • An “AbandonedPercentage” variable indicating the percentage of questions the respondent did not answer because he or she abandoned the survey.

I created this macro because there were several versions of the pilot instrument.  Because you have to “work backward” through each question to identify respondents who abandoned the survey, each version of the instrument required a different set of SPSS syntax because each version had a different set of survey questions.  So it was much easier for me to write a program that generates the appropriate syntax than to do it by hand multiple times.  Laziness is a virtue.
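To make the “work backward” logic concrete, here is a minimal sketch in Python.  It is not the Excel macro and not the SPSS syntax the macro generates; it only illustrates the idea for a single respondent.  It rests on assumptions that the description above does not spell out: a missing answer is represented as None, a respondent counts as having abandoned the survey if the final item (and any unbroken run of items before it) is unanswered, a respondent who answered nothing gets an “Abandoned” value of 0, and both percentages are taken over the total number of items.  The function name and the status labels are made up.

```python
ANSWERED, SKIPPED, ABANDONED = "answered", "skipped", "abandoned"


def classify_respondent(answers):
    """Classify one respondent's answers, given in survey order.

    Missing answers are None (an assumption for this sketch).
    """
    total = len(answers)
    if total == 0:
        raise ValueError("need at least one survey item")

    # Work backward from the final item to the last one that was answered.
    # Positions are 1-based; 0 means nothing was answered at all.
    last_answered = 0
    for position in range(total, 0, -1):
        if answers[position - 1] is not None:
            last_answered = position
            break

    # A missing item before the last answered item was skipped;
    # a missing item after it was lost to abandonment.
    status = []
    for position, value in enumerate(answers, start=1):
        if value is not None:
            status.append(ANSWERED)
        elif position < last_answered:
            status.append(SKIPPED)
        else:
            status.append(ABANDONED)

    skipped = status.count(SKIPPED)
    abandoned = status.count(ABANDONED)
    return {
        # None plays the role of SYSMIS for respondents who finished.
        "Abandoned": last_answered if abandoned else None,
        "SkippedItem": status,
        "SkippedItems": skipped,
        "SkippedPercentage": 100 * skipped / total,
        "AbandonedPercentage": 100 * abandoned / total,
    }


if __name__ == "__main__":
    # Answered item 1, skipped item 2, answered item 3, then stopped.
    print(classify_respondent([3, None, 2, None]))
```

Under these assumptions a respondent who simply skipped the very last question is indistinguishable from one who abandoned the survey at that point, so you may want a stricter rule (for example, requiring several unanswered items at the end) for your own data.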

Warning: This macro generates a lot of syntax.  The sample input has only four variables but it creates code with 105 lines (including blank lines and comments).  The surveys with which I was working had 130-160 variables and I worked with 11 different versions of the survey instrument.  In the end, I had an SPSS syntax file with tens of thousands of lines of code.  The SPSS syntax editor got very grumpy and slow, probably because of the large number of DO IF conditionals and the syntax highlighting it applies to those blocks of code.  I ended up working mostly in Notepad as I was troubleshooting the syntax and pasting the resulting text into the SPSS syntax editor only when I was ready to run it.  The good news is that the syntax is actually very straightforward and arithmetically simple so it ran fairly quickly.

I know that this fills a very, very small niche.  But maybe someone will find this helpful or useful.  I spent a few days working on this so there’s no reason why someone else should have to redo this work.

Warning 2: I used this macro again a few years later and noticed that it’s set up to only deal with numeric data. If you have any string data then you’ll need to modify it accordingly.