SDA 3.5 Documentation for HSDA
NAME
hsda - HTML link to SDA programs
USAGE
hsda?HARC_filename+study_name
or
hsda?HARC_filename+study_name+outstudy_path
or
hsda?HARC_filename+debug (debugging mode)
or
hsda -t filename (output a language file)
DESCRIPTION
The HSDA program provides the link between an HTML document and
the SDA programs available for online use on the World Wide Web.
HSDA presents the user with an option screen and then executes
the SDA program required to carry out the selected option.
A supplementary function of the program is to output the language
strings used by HSDA to create the option screens for each of the
SDA programs. If HSDA is run from the command line with the ‘-t’
flag, the program will write out all the strings into the file
specified after the ‘-t’. They can then be replaced with strings
in another language. HSDA will use the revised strings file to
create option screens, if the name of the directory containing
the modified file is specified with the ‘LANGUAGE=’ keyword in
the general section of the HARC file. See the language document
for details.
OVERVIEW
The HSDA program reads the HTML archive specification file (HARC
file) to determine which codebooks, datasets, and analysis
programs are available and where they are located.
Unless HSDA is being run in debugging mode, the program then
generates an SDA option selection form, displays the form on the
user’s browser, and waits for the user to select one of the
available actions to take.
MODES OF OPERATION
- A Study Name Is Specified
When the reference to the HARC file contains the name of a
specific study (after the plus sign, as in the first model given
in the USAGE section above), that study is automatically
selected. The action to be taken applies only to the study that
has been selected. (See Example 1 below.)
- A Study Name AND an Outstudy Path Are Specified
If the reference to the HARC file contains BOTH the name of a
specific study (after the first plus sign) AND the pathname of an
SDA dataset directory (after the second plus sign), as in the
second model given in the USAGE section above, two things happen:
First, the specified study is selected. The action to be taken
applies only to that study.
Second, the pathname of the SDA dataset directory is interpreted
as if it were specified with the ‘OUTSTUDY=’ keyword for this
study in the HARC file. That dataset directory also becomes the
last SDA dataset searched for variables, just as if it were
specified with the last ‘SDADATA=’ keyword for this study in the
HARC file. (See Example 2 below.)
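The two- and three-argument forms can be assembled mechanically.
The following is a minimal Python sketch, not part of SDA: the
function name hsda_url and its parameters are hypothetical, and
it assumes that the arguments are joined with literal plus signs
exactly as shown in the USAGE section, so any other special
characters inside an argument are percent-encoded while slashes
in a pathname are left alone.

```python
from urllib.parse import quote


def hsda_url(cgi_base, harc, study, outstudy=None):
    """Build an HSDA URL of the form hsda?HARC+study[+outstudy_path].

    cgi_base is the URL of the CGI directory (an assumption of this
    sketch); passing the word 'debug' as `study` gives the
    debugging form of the URL.
    """
    args = [harc, study]
    if outstudy is not None:
        args.append(outstudy)
    # A literal '+' separates the arguments, so each argument is
    # quoted on its own; '/' is kept so pathnames stay readable.
    query = "+".join(quote(arg, safe="/") for arg in args)
    return cgi_base.rstrip("/") + "/hsda?" + query


print(hsda_url("http://sda.berkeley.edu/cgi-bin", "harctest", "nes98"))
print(hsda_url("http://sda.berkeley.edu/cgi-bin", "harctest", "nes98",
               "/newvars/nes98"))
```

Quoting each argument separately keeps a stray plus sign inside a
name from being mistaken for an argument separator.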
The purpose of this option is to allow users to create recoded
and computed variables and to store them in an SDA dataset of
their own. Those variables will then be available for use by any
of the SDA analysis programs.
The specified SDA dataset directory must be writable by the Web
server process; therefore, write permission must be enabled for
the user group to which the Web server process belongs. That SDA
dataset directory must contain two subdirectories named ‘VARS’
and ‘STUDYINF’, which must also be writable by that group.
Furthermore, the ‘VARS’ subdirectory must contain a copy of the
CASEID variable for the study, and that variable must be readable
by the Web server group. The parent directories of the SDA
dataset directory do not need to be readable or writable, but
they must be searchable by the Web server group, so that the Web
server can reach the subdirectories within them.
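The requirements above can be checked before a dataset directory
is offered to users. The following Python sketch is only an
illustration under several assumptions: it runs on a UNIX system,
the directories already belong to the Web server's group (the
sketch inspects the group permission bits but does not check
group ownership), and the CASEID variable is stored in ‘VARS’
under the name passed as caseid_name, which is a hypothetical
parameter because the on-disk layout of a variable is not
described here.

```python
import os
import stat


def group_has(path, bits):
    """Return True if all of `bits` are set in the mode of `path`."""
    return (os.stat(path).st_mode & bits) == bits


def check_outstudy(dataset_dir, caseid_name="CASEID"):
    """Return a list of problems found; an empty list means none."""
    problems = []
    vars_dir = os.path.join(dataset_dir, "VARS")
    studyinf_dir = os.path.join(dataset_dir, "STUDYINF")

    # The dataset directory and both subdirectories must be
    # writable (and searchable) by the group.
    for d in (dataset_dir, vars_dir, studyinf_dir):
        if not os.path.isdir(d):
            problems.append("missing directory: " + d)
        elif not group_has(d, stat.S_IWGRP | stat.S_IXGRP):
            problems.append("not group-writable: " + d)

    # VARS must hold a group-readable copy of the CASEID variable.
    caseid = os.path.join(vars_dir, caseid_name)
    if not os.path.exists(caseid):
        problems.append("missing CASEID variable: " + caseid)
    elif not group_has(caseid, stat.S_IRGRP):
        problems.append("not group-readable: " + caseid)

    # Every parent directory needs group search permission only.
    parent = os.path.dirname(os.path.abspath(dataset_dir))
    while True:
        if os.path.isdir(parent) and not group_has(parent, stat.S_IXGRP):
            problems.append("not group-searchable: " + parent)
        above = os.path.dirname(parent)
        if above == parent:
            break
        parent = above
    return problems
```

A call such as check_outstudy("/newvars/nes98") returns one line
per problem, so the result can be printed directly.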
- Debugging Mode
If the HSDA program is executed with the word ‘debug’ as the
first argument after a plus sign (the third model given above in
the USAGE section), HSDA will check the HARC file for errors. The
program performs a syntax check on the HARC file, and it checks
that the files and directories specified in the HARC file
actually exist and can be accessed in the appropriate mode
(reading or writing). Provided that JavaScript is enabled, HSDA
also tries to validate the URLs in the HARC file. After all the
checking has been done, HSDA sends a report to the browser.
A HARC file can be checked or debugged from any browser by
executing a URL of the following form, which uses the word
‘debug’ instead of the name of a study after the plus sign:
http://sda.berkeley.edu/cgi-bin/hsda?harcfile+debug
(See Example 3 below.)
Debug mode is disabled by default. To enable it, add a
‘DEBUG=YES’ line in the [GENERAL] section of the HARC file. Once
a HARC file has been checked and debugged, debug mode can be
turned off again by removing the ‘DEBUG=YES’ line or by
commenting it out with a ‘#’ in the first column of the line.
(Note: debug mode was enabled by default until release #2 (May
2004) of SDA 1.3.)
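Turning debug mode off after a HARC file has been checked can be
scripted. The following Python sketch is a hypothetical helper,
not an SDA tool; it assumes that the line reads exactly
‘DEBUG=YES’ apart from surrounding blanks, and it comments the
line out by placing a ‘#’ in the first column, as described
above. The ‘LANGUAGE=’ value in the sample text is invented for
illustration.

```python
def disable_debug(harc_text):
    """Comment out each DEBUG=YES line of a HARC file's text."""
    result = []
    for line in harc_text.splitlines(keepends=True):
        # Lines that are already commented out start with '#',
        # so they do not match and are left untouched.
        if line.strip() == "DEBUG=YES":
            line = "#" + line
        result.append(line)
    return "".join(result)


sample = "[GENERAL]\nDEBUG=YES\nLANGUAGE=lang_fr\n"
print(disable_debug(sample))
```

Because commented lines no longer match, running the function
twice gives the same result as running it once.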
ACTIONS TO TAKE
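Currently, HSDA can execute any of the following actions,
provided they are enabled for the selected dataset:
- View the study documentation or codebook(s)
- Run analysis programs
- Download pre-existing data or documentation files
- Create and download a customized subset of the data file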
The HTML page that calls or executes the HSDA program must
always specify (as an argument) the name of the HARC file which
the program is to use. The HARC file, along with the HSDA
program, is assumed to be in the directory containing CGI
programs. (However, the HARC file can be located elsewhere on
the server, provided that the full pathname is specified as the
argument for HSDA.) Under UNIX, this directory is usually named
‘cgi-bin’, although the Web server can be configured to search
other directories for the CGI programs. The layout and contents
of the HARC file are described in a separate document.
The HTML page that calls the HSDA program must also specify (as a
second argument) the name of one of the studies defined in the
HARC file. The HSDA program will then generate the appropriate
SDA form for that study. That form will provide access to the
available codebooks, SDA analysis programs, and files for
subsetting and/or downloading.
VIEW THE STUDY DOCUMENTATION OR CODEBOOK(S)
One of the actions that the user may select is to view the study
documentation or codebook for the selected study. The location
of the base HTML file for the codebook must be specified in the
HARC file. The same SDA XCODEBK program that generates the HTML
codebook also creates the ‘tree_items.js’ file used by the new
SDA interface to display the variable tree.
If more than one codebook is specified in the HARC file, the user
will be able to choose one at a time. Multiple codebooks are
each given a label, so that the user can distinguish between
them.
RUN ANALYSIS PROGRAMS
The HSDA program provides online access to SDA analysis programs
by generating an HTML form, specific to each analysis program, to
gather the names of variables and the options desired by the
user. The specified variable names and options are incorporated
into a file of batch commands. The appropriate SDA analysis
program is then executed in batch mode, using the file of
commands prepared by HSDA.
HSDA provides for the online execution of several SDA programs
for analyzing variables, creating and listing new variables, and
listing the contents of selected variables for selected cases.
The program names and the pathname of the location of SDA
programs on the server computer must be specified in the HARC
file.
For online analysis to be made available for a particular study,
an SDA dataset for that study must be located on the server
computer. The pathname of the appropriate SDA dataset directory
must be given in the HARC file. A single study may contain
variables in more than one SDA dataset directory, provided that
all the datasets have the same number of cases and that the
sequence of the cases in each dataset is the same.
DOWNLOAD PRE-EXISTING DATA OR DOCUMENTATION FILES
An archive can use the interface provided by the HSDA program to
distribute pre-existing data files or files containing study
documentation. When the HSDA program reads the HARC file, it
checks to see if there are any files defined as being available
for downloading. If it finds such definitions, it includes the
downloading option on the HTML form presented to the user to
select an action.
Examples of the types of files that can be set up for downloading
include the following:
- Data files (plain ASCII or zipped or compressed),
- DDL or DDI files to match the data files,
- Data definition files for SAS, SPSS, or Stata,
- Printable codebook or documentation files,
- Self-extracting archive files that expand into sets of HTML
files.
More than one file of each type can be made available for a
specific study, and labels for each file can be specified to aid
the user in selecting a file for downloading.
Each file for downloading is identified by a URL. The last
component of the URL will be used as the default filename. To
avoid confusing the user’s browser, that filename should include
a suffix to indicate the type of file -- e.g., ‘.txt’ for a plain
text file, ‘.zip’ for a zipped file, and ‘.exe’ for a
self-extracting zipped archive file.
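The default filename that a download URL will produce can be
previewed before the URL is entered in the HARC file. The
following Python sketch uses hypothetical function names and an
invented example URL; it takes the last component of the URL path
and reports whether that name carries a suffix.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def default_filename(url):
    """Return the last component of the URL path."""
    return posixpath.basename(unquote(urlsplit(url).path))


def has_type_suffix(url):
    """Return True if the default filename ends in a suffix like .zip."""
    root, ext = posixpath.splitext(default_filename(url))
    return bool(root) and len(ext) > 1


url = "http://example.org/download/nes98.zip"
print(default_filename(url), has_type_suffix(url))
```

A URL whose last component has no suffix is a candidate for
renaming before it is offered for downloading.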
CREATE AND DOWNLOAD A CUSTOMIZED SUBSET OF THE DATA FILE
An archive can also use the interface provided by the HSDA
program to allow users to create a customized subset of a data
file, together with documentation of that subset. When the HSDA
program reads the HARC file, it checks to see if there are any
files defined as being available for subsetting. If it finds such
definitions, it includes the customized subset option on the HTML
form presented to the user to select an action. The source file
for a subset operation must be an ASCII data file; a matching DDL
file is also required.
The following files can be obtained from a subset operation:
- Data file for the subset (plain fixed-column ASCII, with
optional space or comma delimiters between variables)
- Data definitions for the subset, for SAS, SPSS, and Stata
- DDL file or DDI file describing the subset
- Codebook to document the subset (plain text file, formatted
for viewing or printing)
RESTRICTING ACCESS
An archive may wish to restrict online access to some or all of
the codebooks or datasets that are referenced in the HARC file.
Since the HSDA program does not include security features of its
own, archivists should use the features built into their server
software.
The server software installed on the archive’s server computer
generally includes security features. For example, it is
possible to require that users give a password in order to access
certain files or directories. Access can also generally be
limited to specific users or host computers. Once users have
been authenticated by the server, they can then be allowed to
execute the SDA programs and procedures.
EXAMPLE 1 - EXECUTE HSDA WITH 2 ARGUMENTS (HARC + STUDY_NAME)
(The HARC file is ‘harctest’; the study name is ‘nes98’ or
‘gss98’.)
SDA Archive for the 1998 NES and GSS Surveys
The codebooks and data files for the 1998 NES and GSS
Surveys are set up here as an SDA archive for the Web.
At this site you can browse the documentation of
the surveys and do a little analysis.
Select one of the following studies:
- 1998 NES Survey (link to hsda?harctest+nes98)
- 1998 GSS Survey (link to hsda?harctest+gss98)
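The HTML markup of the calling page is not reproduced in this
text. As a rough illustration only, the following Python sketch
generates a page of this kind. The function name, the page
layout, and the ‘/cgi-bin/hsda’ location are assumptions; only
the HARC file name, the study names, and the link labels come
from the example above.

```python
from html import escape


def archive_page(title, intro, harc, studies, cgi_path="/cgi-bin/hsda"):
    """Return an HTML page with one HSDA link per (study, label) pair."""
    items = "\n".join(
        '  <li><a href="{}?{}+{}">{}</a></li>'.format(
            escape(cgi_path), escape(harc), escape(study), escape(label))
        for study, label in studies
    )
    return (
        "<html>\n"
        "<head><title>{0}</title></head>\n"
        "<body>\n"
        "<h1>{0}</h1>\n"
        "<p>{1}</p>\n"
        "<p>Select one of the following studies:</p>\n"
        "<ul>\n{2}\n</ul>\n"
        "</body>\n"
        "</html>\n"
    ).format(escape(title), escape(intro), items)


print(archive_page(
    "SDA Archive for the 1998 NES and GSS Surveys",
    "At this site you can browse the documentation of the surveys "
    "and do a little analysis.",
    "harctest",
    [("nes98", "1998 NES Survey"), ("gss98", "1998 GSS Survey")],
))
```

Each link passes the HARC file name and one study name to HSDA,
which matches the two-argument form in the USAGE section.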
EXAMPLE 2 - EXECUTE HSDA WITH 3 ARGUMENTS (HARC + STUDY_NAME + OUTSTUDY)
(The HARC file is ‘harctest’; the study name is ‘nes98’ or
‘gss98’. The directories ‘/newvars/nes98’ and ‘/newvars/gss98’
have been set up appropriately as SDA datasets.)
My SDA Archive
The codebooks and data files for the 1998 NES and GSS
Surveys are set up here as an SDA archive for the Web.
You can also create recodes and computed variables for
each survey and use them in your analysis.
Select one of the following studies:
- 1998 NES Survey (link to hsda?harctest+nes98+/newvars/nes98)
- 1998 GSS Survey (link to hsda?harctest+gss98+/newvars/gss98)
EXAMPLE 3 - EXECUTE HSDA IN DEBUGGING MODE (HARC + debug)
(The HARC file is ‘harctest’, with ‘DEBUG=YES’ specified in the
HARC file.)
Debug or check syntax in the HARC file
- Check the HARC file (link to hsda?harctest+debug)
SEE ALSO
harc - HTML archive specification file
language - Create a non-English user interface
xcodebk - Produce a codebook for an SDA dataset
CSM, UC Berkeley
April 12, 2011