SDA 3.5 Documentation for HSDA

NAME

hsda - HTML link to SDA programs

USAGE

hsda?HARC_filename+study_name
or
hsda?HARC_filename+study_name+outstudy_path
or
hsda?HARC_filename+debug (debugging mode)
or
hsda -t filename (output a language file)

DESCRIPTION

The HSDA program provides the link between an HTML document and the SDA programs available for online use on the World Wide Web. HSDA presents the user with an option screen and then executes the SDA program required to carry out the selected option.

A supplementary function of the program is to output the language strings used by HSDA to create the option screens for each of the SDA programs. If HSDA is run from the command line with the ‘-t’ flag, the program will write out all the strings into the file specified after the ‘-t’. They can then be replaced with strings in another language. HSDA will use the revised strings file to create option screens, if the name of the directory containing the modified file is specified with the ‘LANGUAGE=’ keyword in the general section of the HARC file. See the language document for details.

OVERVIEW

The HSDA program reads the HTML archive specification file (HARC file) to determine which codebooks, datasets, and analysis programs are available and where they are located.

Unless HSDA is being run in debugging mode, the program then generates an SDA option selection form, displays the form on the user’s browser, and waits for the user to select one of the available actions to take.

MODES OF OPERATION

A Study Name Is Specified
When the reference to the HARC file contains the name of a specific study (after the plus sign, as in the first model given in the USAGE section above), that study is automatically selected. The action to be taken applies only to the study that has been selected. (See Example 1 below.)
A Study Name AND an Outstudy Path are Specified
If the reference to the HARC file contains BOTH the name of a specific study (after the first plus sign, as in the second model given in the USAGE section above), AND the pathname of an SDA dataset directory (after the second plus sign), two things happen:
First, the specified study is selected. The action to be taken applies only to that study.
Second, the pathname of the SDA dataset directory is interpreted as if it were specified with the ‘OUTSTUDY=’ keyword for this study in the HARC file. That dataset directory also becomes the last SDA dataset searched for variables, just as if it were specified with the last ‘SDADATA=’ keyword for this study in the HARC file. (See Example 2 below.)
The purpose of this option is to allow users to create recoded and computed variables and to store them in an SDA dataset of their own. Those variables will then be available for use by any of the SDA analysis programs.
The specified SDA dataset directory must be writable by the Web server process; therefore, write permission must be enabled for the user group to which the Web server process belongs. The directory must contain two subdirectories named ‘VARS’ and ‘STUDYINF’, which must also be writable by that group. Furthermore, the ‘VARS’ subdirectory must contain a copy of the CASEID variable for the study, and that variable must be readable by the Web server group. The parent directories of the SDA dataset directory do not need to be readable or writable, but they must be searchable by the group to which the Web server process belongs, so that the Web server can reach the subdirectories within them.
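The permission requirements above can be checked before a dataset directory is offered to users. The following Python sketch is not part of SDA; it only inspects the group permission bits of the directory, its ‘VARS’ and ‘STUDYINF’ subdirectories, and the CASEID variable. The function name and the on-disk name of the CASEID variable (the 'caseid_name' parameter) are assumptions made for illustration, and the sketch does not verify that the Web server's group actually owns the files or that the parent directories are searchable.

```python
import os
import stat


def check_outstudy_dir(path, caseid_name="CASEID"):
    """Return a list of problems found in an outstudy dataset directory.

    'caseid_name' is a hypothetical on-disk name for the CASEID variable;
    adjust it to match how the variable is actually stored in VARS.
    """
    problems = []
    vars_dir = os.path.join(path, "VARS")
    studyinf_dir = os.path.join(path, "STUDYINF")

    # The dataset directory and both subdirectories must be group-writable.
    for directory in (path, vars_dir, studyinf_dir):
        if not os.path.isdir(directory):
            problems.append("missing directory: " + directory)
        elif not os.stat(directory).st_mode & stat.S_IWGRP:
            problems.append("not group-writable: " + directory)

    # VARS must hold a group-readable copy of the CASEID variable.
    caseid = os.path.join(vars_dir, caseid_name)
    if not os.path.exists(caseid):
        problems.append("CASEID variable not found: " + caseid)
    elif not os.stat(caseid).st_mode & stat.S_IRGRP:
        problems.append("CASEID variable not group-readable: " + caseid)

    return problems
```

An empty list means that none of the checked requirements failed; each string in a non-empty list names one directory or file that needs attention.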
Debugging Mode
If the HSDA program is executed with the word ‘debug’ as the first argument after a plus sign (the third model given above in the USAGE section), HSDA will check the HARC file for errors. The program performs a syntax check on the HARC file, and it checks that the files and directories specified in the HARC file actually exist and can be accessed in the appropriate mode (reading or writing). Provided that JavaScript is enabled, HSDA also tries to validate the URLs in the HARC file. After all the checking has been done, HSDA sends a report to the browser.
A HARC file can be checked or debugged from any browser by executing a URL of the following form, which uses the word ‘debug’ instead of the name of a dataset after the plus sign:
http://sda.berkeley.edu/cgi-bin/hsda?harcfile+debug
(See Example 3 below.)
Debug mode is disabled by default. To enable it, add a ‘DEBUG=YES’ line in the [GENERAL] section of the HARC file. Once a HARC file is checked and debugged, this access to debug mode can be turned off by removing the ‘DEBUG=YES’ line or commenting it out by placing a ‘#’ in the first column of the line. (Note: debug mode was enabled by default until release #2 (May 2004) of SDA 1.3.)
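Turning debug mode off again is easy to forget once a HARC file is in production. As a minimal sketch, the Python function below comments out the ‘DEBUG=YES’ line by placing a ‘#’ in the first column, as described above. The function name is hypothetical, and the sketch assumes that a section starts with a line such as [GENERAL] and that the keyword is written exactly as ‘DEBUG=YES’ on a line of its own; other spellings are left untouched.

```python
def disable_debug(harc_text):
    """Comment out DEBUG=YES lines found in the [GENERAL] section."""
    result = []
    in_general = False
    for line in harc_text.splitlines(keepends=True):
        stripped = line.strip()
        # Assumption: a line of the form [NAME] starts a new section.
        if stripped.startswith("[") and stripped.endswith("]"):
            in_general = stripped == "[GENERAL]"
        elif in_general and stripped == "DEBUG=YES":
            # The '#' goes in the first column of the line.
            line = "#" + line
        result.append(line)
    return "".join(result)
```

Lines that are already commented out do not match and are returned unchanged, so the function can be applied repeatedly to the same file contents.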

ACTIONS TO TAKE

Currently, HSDA can execute any of the following actions, provided they are enabled for the selected dataset:

View the study documentation or codebook
Run analysis programs
Download pre-existing data or documentation files
Create and download a customized subset of the data file

The HTML page that calls or executes the HSDA program must always specify (as an argument) the name of the HARC file which the program is to use. The HARC file, along with the HSDA program, is assumed to be in the directory containing CGI programs. (However, the HARC file can be located elsewhere on the server, provided that the full pathname is specified as the argument for HSDA.) Under UNIX, this directory is usually named ‘cgi-bin’, although the Web server can be configured to search other directories for the CGI programs. The layout and contents of the HARC file are described in a separate document.

The HTML page that calls the HSDA program must also specify (as a second argument) the name of one of the studies defined in the HARC file. The HSDA program will then generate the appropriate SDA form for that study. That form will provide access to the available codebooks, SDA analysis programs, and files for subsetting and/or downloading.
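When many links have to be generated, the calling conventions described above can be assembled programmatically. The following Python sketch builds the three URL forms listed in the USAGE section. The function name is hypothetical, the default base path ‘/cgi-bin/hsda’ is taken from the examples in this document, and the arguments are joined with plus signs without any further encoding, which is an assumption that holds for simple names and pathnames like those in the examples.

```python
def hsda_url(harc, study=None, outstudy=None, debug=False,
             base="/cgi-bin/hsda"):
    """Build an HSDA URL: harc+study, harc+study+outstudy, or harc+debug."""
    if debug:
        if study is not None or outstudy is not None:
            raise ValueError("debug mode takes only the HARC filename")
        parts = [harc, "debug"]
    else:
        if study is None:
            raise ValueError("a study name is required")
        parts = [harc, study]
        if outstudy is not None:
            parts.append(outstudy)

    # Plus signs separate the arguments, so none may appear inside one.
    for part in parts:
        if not part or any(ch in part for ch in "+ \t\n"):
            raise ValueError("invalid argument: %r" % (part,))

    return base + "?" + "+".join(parts)
```

For instance, hsda_url("harctest", "nes98", "/newvars/nes98") returns "/cgi-bin/hsda?harctest+nes98+/newvars/nes98", and hsda_url("harctest", debug=True) returns "/cgi-bin/hsda?harctest+debug".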

VIEW THE STUDY DOCUMENTATION OR CODEBOOK(S)

One of the actions that the user may select is to view the study documentation or codebook for the selected study. The location of the base HTML file for the codebook must be specified in the HARC file. The same SDA XCODEBK program that generates the HTML codebook also creates the ‘tree_items.js’ file used by the new SDA interface to display the variable tree.

If more than one codebook is specified in the HARC file, the user will be able to choose one at a time. Multiple codebooks are each given a label, so that the user can distinguish between them.

RUN ANALYSIS PROGRAMS

The HSDA program provides online access to SDA analysis programs by generating an HTML form, specific to each analysis program, to gather the names of variables and the options desired by the user. The specified variable names and options are incorporated into a file of batch commands. The appropriate SDA analysis program is then executed in batch mode, using the file of commands prepared by HSDA.

HSDA provides for the online execution of several SDA programs for analyzing variables, creating and listing new variables, and listing the contents of selected variables for selected cases. The program names and the pathname of the location of SDA programs on the server computer must be specified in the HARC file.

For online analysis to be made available for a particular study, an SDA dataset for that study must be located on the server computer. The pathname of the appropriate SDA dataset directory must be given in the HARC file. A single study may contain variables in more than one SDA dataset directory, provided that all the datasets have the same number of cases and that the sequence of the cases in each dataset is the same.

DOWNLOAD PRE-EXISTING DATA OR DOCUMENTATION FILES

An archive can use the interface provided by the HSDA program to distribute pre-existing data files or files containing study documentation. When the HSDA program reads the HARC file, it checks to see if there are any files defined as being available for downloading. If it finds such definitions, it includes the downloading option on the HTML form presented to the user to select an action.

Examples of the types of files that can be set up for downloading include the following:

Data files (plain ASCII or zipped or compressed),
DDL or DDI files to match the data files,
Data definition files for SAS, SPSS, or Stata,
Printable codebook or documentation files,
Self-extracting archive files that expand into sets of HTML files.

More than one file of each type can be made available for a specific study, and labels for each file can be specified to aid the user in selecting a file for downloading.

Each file for downloading is identified by a URL. The last component of the URL will be used as the default filename. To avoid confusing the user’s browser, that filename should include a suffix to indicate the type of file -- e.g., ‘.txt’ for a plain text file, ‘.zip’ for a zipped file, and ‘.exe’ for a self-extracting zipped archive file.
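Before adding download URLs to a HARC file, it can be useful to confirm which default filename each one implies. The Python sketch below extracts the last component of a URL and reports whether it carries a suffix. Both function names are hypothetical, and the URLs shown afterward are placeholders rather than real download locations.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def default_filename(url):
    """Return the last path component of a URL (the default filename)."""
    path = urlsplit(url).path
    return unquote(posixpath.basename(path))


def has_type_suffix(url):
    """Return True if the default filename ends in a suffix such as '.zip'."""
    root, ext = posixpath.splitext(default_filename(url))
    # A bare '.' or a name with no suffix does not count.
    return bool(root) and len(ext) > 1
```

With these definitions, default_filename("http://example.org/data/nes98.zip") returns "nes98.zip", while has_type_suffix("http://example.org/data/nes98") returns False, which flags a URL whose filename should be given a suffix.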

CREATE AND DOWNLOAD A CUSTOMIZED SUBSET OF THE DATA FILE

An archive can also use the interface provided by the HSDA program to allow users to create a customized subset of a data file, together with documentation of that subset. When the HSDA program reads the HARC file, it checks to see if there are any files defined as being available for subsetting. If it finds such definitions, it includes the customized subset option on the HTML form presented to the user to select an action. The source file for a subset operation must be an ASCII data file; a matching DDL file is also required.

The following files can be obtained from a subset operation:

Data file for the subset (plain fixed-column ASCII, with optional space or comma delimiters between variables)
Data definitions for the subset, for SAS, SPSS, and Stata
DDL file or DDI file describing the subset
Codebook to document the subset (plain text file, formatted for viewing or printing)

RESTRICTING ACCESS

An archive may wish to restrict online access to some or all of the codebooks or datasets that are referenced in the HARC file. Since the HSDA program does not include security features of its own, archivists should use the features built into their server software.

The server software installed on the archive’s server computer generally includes security features. For example, it is possible to require that users give a password in order to access certain files or directories. Access can also generally be limited to specific users or host computers. Once users have been authenticated by the server, they can then be allowed to execute the SDA programs and procedures.

EXAMPLE 1 - EXECUTE HSDA WITH 2 ARGUMENTS (HARC + STUDY_NAME)

(The HARC file is ‘harctest’; the study name is ‘nes98’ or ‘gss98’.)

<html>
<head>
<title> SDA Archive for the 1998 NES and GSS Surveys </title>
</head>
<body>
<h2> SDA Archive for 1998 NES and GSS Surveys </h2>
The codebooks and data files for the 1998 NES and GSS Surveys are set up here as an SDA archive for the Web. At this site you can browse the documentation of the surveys and do a little analysis.
<p>
Select one of the following studies:
<p>
<a href="/cgi-bin/hsda?harctest+nes98"> 1998 NES Survey</a>
<p>
<a href="/cgi-bin/hsda?harctest+gss98"> 1998 GSS Survey</a>
<p>
</body>
</html>

EXAMPLE 2 - EXECUTE HSDA WITH 3 ARGUMENTS (HARC + STUDY_NAME + OUTSTUDY)

(The HARC file is ‘harctest’; the study name is ‘nes98’ or ‘gss98’. The directories ‘/newvars/nes98’ and ‘/newvars/gss98’ have been set up appropriately as SDA datasets.)

<html>
<head>
<title> My SDA Archive </title>
</head>
<body>
<h2> My SDA Archive </h2>
The codebooks and data files for the 1998 NES and GSS Surveys are set up here as an SDA archive for the Web.
<p>
You can also create recodes and computed variables for each survey and use them in your analysis.
<p>
Select one of the following studies:
<p>
<a href="/cgi-bin/hsda?harctest+nes98+/newvars/nes98"> 1998 NES Survey</a>
<p>
<a href="/cgi-bin/hsda?harctest+gss98+/newvars/gss98"> 1998 GSS Survey</a>
<p>
</body>
</html>

EXAMPLE 3 - EXECUTE HSDA IN DEBUGGING MODE (HARC + debug)

(The HARC file is ‘harctest’ with ‘DEBUG=YES’ specified in the HARC file.)

<html>
<head>
<title> Debug or check syntax in the HARC file </title>
</head>
<body>
<h2> Debug or check syntax in the HARC file </h2>
<p>
<a href="/cgi-bin/hsda?harctest+debug"> Check the HARC file</a>
</body>
</html>

SEE ALSO

harc	HTML archive specification file
language	Create a non-English user interface
xcodebk	Produce a codebook for an SDA dataset