Manfred Kuechler
 

Version: February 9, 2007
 

Guidelines for Attribution and Citation in Internet Research Papers

 
This is a summary of the various issues related to formatting and to attribution and citation in "Internet Research papers" also referred to as online (htm, web format) papers. To make this document useful beyond any specific course, I have included some information/advice which you may not be able to follow for the paper in your specific class or which may not be relevant for your particular assignment. So, use this document as a checklist and follow the advice as best as you can. 

If you have any questions about the contents of this document, please post them on the Discussion Board of  the Bb course web site. I will respond to them there and also use these questions for future versions of this document.

Note that a number of "screen movies" demonstrating how to do certain technical tasks will be made available in the "Course Documents" area of the course site for GSR716 (Spring 2007). These screen movies are already available on the GSR 716 site for Spring 2006. Most of these screen movies were produced for the Fall 2003 class of GSR716. Consequently, to the extent that these screen movies capture some of the look of the Fall 2003 Bb course site (using a previous version of the Bb software), there will be some (inconsequential) differences compared to the current Bb course site. Time permitting, I will try to redo most of these screen movies based on current versions of the software involved for the 2007 class.

Table of Contents

What is an "online paper"?

I use this term to refer to any document in html (Hyper Text Markup Language) format. Such documents can be displayed by any web browser and are therefore particularly well suited to be distributed via a web server. Compared to common word processor formats (.doc for MS Word and .wpd for Corel WordPerfect) with the same substantive content, the file size of htm documents is considerably smaller. Htm documents can be produced by many different software products, ranging from very simple text editors (like Notepad) through specialized htm editors (like Netscape Composer, Nvu, or MS FrontPage) to the common word processing software (like MS Word and Corel WordPerfect). Starting with the "2000" versions, both MS Word and Corel WordPerfect allow simple "two mouse click" conversions of their standard format to htm (web) format (more advice).

An alternative to html for an online paper is pdf (Portable Document Format). Many organizations, especially government agencies, use this format, as it allows complete control over the page layout (which is lost when converting to htm). However, you would need the full Adobe Acrobat software at a cost of about $140 (academic edition of version 8) to produce such files, while the Adobe/Acrobat Reader (to view such documents) is free. Starting with version 6 (released in 2003), there are several new features (including better access for people with visual impairments) which make pdf an even more attractive alternative for producing online papers, but the full product runs on Windows 2000/XP only. You can access the full version 7 (and hopefully 8 at some point) in the Hunter labs. This version (released in January 2005) also allows the inclusion of animations and multimedia material. The version 8 Reader is still free, but now even the Reader requires Windows XP.

As to your paper (first draft and final version), I expect htm format, but you are free to use whatever software you are comfortable with to produce this document in htm format. I assume that most of you will use some version of  MS Word (2000,  XP/2002, or 2003) and its built-in htm conversion.
 

Internal division of an html paper

As there is no fixed page layout in htm format, you cannot refer to specific parts of your paper by page numbers. Therefore, a clear structure with sections and subsections and corresponding headings is crucial (though it is good practice to have these in a standard hard copy paper as well). In technical html lingo, each section -- as well as any specific part of a document -- is called a fragment and you need to mark and name such "fragments". If you use MS Word, simply insert a "bookmark" at the start of each fragment; if you use Netscape/Mozilla/Firefox  insert a "target" (the term "bookmark" means something else in the context of Netscape/Mozilla/Firefox); the current version 7 of the Netscape composer as well as Nvu uses the term "named anchor".  You must name such "bookmarks" (MS Word) or "targets"/"anchors" (NS Composer/Nvu) so that a reader can be sent directly to a specific point in your paper. Which name you select is irrelevant as long as you don't duplicate any name; for your own convenience, I suggest that you use names reminding you of the specific contents at this point of the paper. E.g., I have named the target/bookmark/anchor at the beginning of this section "internal".

Named "fragments" can be used for both internal links (within the same document) and external links (links in other documents pointing to a specific part of your document). A very common use of internal links is for a table of contents which you may want to put at the very start of your paper (as in this document). But you can use such internal links anywhere in your document -- whenever you feel it is useful to remind the reader of an argument made before and/or some evidence introduced above or below. Internal links are particularly simple. Instead of an URL (something like "http://maxweber.cuny.edu/socio/index.htm") they simply use a hash mark followed by the name of the bookmark/target/anchor (something like "#internal"). For those who have ventured a bit into the actual htm language, the complete "tag" for an internal link looks like this: <a href="#internal">.....anchor text....</a>. For external links, the part sufficient for internal links is simply added to the normal URL (like "http://maxweber.cuny.edu/socio/index.htm#hot").

To summarize: In addition to what you would do in a conventional paper, you need to name specific fragments (points in your paper to which you want to send a reader directly) and this "sending directly" is done via internal links. Named fragments also allow external links to point to a specific part of your paper rather than just to the start.
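The two rules above (no duplicate fragment names, and every internal link must point to a name that exists) can be checked mechanically. What follows is only a minimal sketch in Python, which is not part of the course toolset; the class name FragmentCollector and the function check_fragments are hypothetical helpers made up for this illustration. It assumes fragments are marked either as <a name="..."> or with an id attribute, which is how the common editors write them as far as I know.

```python
from collections import Counter
from html.parser import HTMLParser


class FragmentCollector(HTMLParser):
    """Collect fragment names and the targets of internal links."""

    def __init__(self):
        super().__init__()
        self.names = []    # fragment names defined in the document
        self.targets = []  # names used by internal links (href="#...")

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # A tag may carry name= and id= with the same value; count it once.
        found = {attrs.get("name") if tag == "a" else None, attrs.get("id")}
        self.names.extend(n for n in found if n)
        href = attrs.get("href") or ""
        if tag == "a" and href.startswith("#") and len(href) > 1:
            self.targets.append(href[1:])


def check_fragments(html_text):
    """Return (duplicated names, internal links pointing to undefined names)."""
    collector = FragmentCollector()
    collector.feed(html_text)
    counts = Counter(collector.names)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    defined = set(collector.names)
    dangling = sorted({t for t in collector.targets if t not in defined})
    return duplicates, dangling


sample = """
<a name="internal"></a><h2>Internal division</h2>
<a name="sources"></a><h2>Sources</h2>
<a name="internal"></a><h2>Same name used again by mistake</h2>
<a href="#internal">back to the section</a>
<a href="#hot">no fragment defines this name</a>
"""
duplicates, dangling = check_fragments(sample)
print("Duplicate names:", duplicates)
print("Links without a target:", dangling)
```

In the sample, "internal" is reported as a duplicate and "hot" as a link without a target.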
 

Attribution of conventional sources (published books and journal articles)

Traditional "library resources" (published books and journal articles and similar material) should be referenced like in any other paper. An online paper must have a "reference" or "bibliography" section just like a hard copy paper and you should follow any of the commonly used formats (ASA, APA, MLA, Chicago). Personally, I don't really care what you use; you could even invent your own system as long as you use it consistently and as long as all necessary details are supplied. However, many professors insist on a specific reference style and when you submit a manuscript to a journal, this journal will also insist on one particular style. So, it pays to be familiar with at least one or two of the common styles. With the availability of the EndNote software this burden is greatly reduced as the software will take care of all the details; you just select a specific "style" and the software will put the references (contained in your EndNote "library") into this specific style. You still have to collect the information first, of course.

There are two exceptions to my laissez-faire attitude about which style you use: 

The preferred way of making an attribution to a published work is to include author and year; either both in parentheses or by adding the year of the publication in parentheses to the author's name when this name is part of the text anyway. Again, I recommend consulting a standard textbook (e.g., Raimes 2008) on writing if you are unsure how to handle this in a conventional paper. Raimes (2005) is one of my favorite textbooks, and if this does not provide enough detail for you, there is a larger "handbook" version as well (Raimes and Jerskey 2008). As to publications with more than one author, the different styles have different rules as to how many you need to include and how exactly to list them in the reference section, but I don't care about these nitty-gritty details. For books, consider adding internal links to the bibliographical details using the author name as an anchor -- as I did above. For journal articles available online, you should rather link to the full text (see below).

These days, there are still relatively few books available in electronic format (online), but when it comes to journal articles chances are that the full text is available online. If this is the case, a link to the online source of the article must be added. However, you need to be careful that the link you add is a "persistent" link. In most online data bases, you use a search engine to locate a specific article and in many cases the URL which appears in the address box of your browser window is not persistent. It may work for a few hours, sometimes even for a day or two, but then it ceases to work. When checking whether a specific URL is persistent, make sure that you empty the cache of your browser first [2], or you may be fooled into thinking that the URL is persistent because the document is displayed from the browser's cache rather than from the source, and, consequently, the link would not work on another computer. Detailed instructions for finding persistent URLs for the most commonly used journal databases are provided in a separate document.

Adding the URL for the online location. Once you have secured a persistent URL for a journal article, the most plausible way to include this in your paper is to make the name of author(s) the "anchor" for this link. Like: The links point to the location of the article in the EBSCO data base. For these links to work, the reader of the document must be authorized to access this data base. And with the introduction of  EZ proxy (see the overview  document on finding published research), which in many respects makes it easier to use licensed resources from off campus, there is now also a complication. More precisely, you have a choice between two forms of the "persistent URL" for the article:
The generic form allows the reader of your paper to establish his/her authorization to use the EBSCO data base in a variety of ways (e.g., by accessing EBSCO via the CUNY portal first), whereas the second form requires authorization (authentication) via a Hunter e-mail user name and password. So, the second form is more convenient for readers at Hunter but a gross inconvenience for non-Hunter readers. To demonstrate the difference, I have used the Hunter-specific form for the first "Zunes" link above, and the generic form for the second. (You will notice a difference only when using these links on a non-Hunter computer.) If you are writing a paper (solely) for a course at Hunter, use the second form; if you are writing a paper where the anticipated readership includes non-Hunter people, use the generic form.

When you connect via the Hunter EZ proxy server, the persistent URL displayed on screen is modified for some databases -- including EBSCO -- while for other databases -- including JSTOR -- it remains the same. See the two examples below:



However, when exporting references in RIS format (EBSCO, JSTOR, and other databases) the persistent/stable URLs in their original (generic) form are used. So, making these URLs work in your paper from off campus (for Hunter folks) requires that you manually modify the URLs as shown above. And as the papers in this class are written for a Hunter readership, you should provide the modified versions in your paper.


PDF version of journal articles. When using the htm version of an online journal article, we are not able to make precise page references to specific parts of the article (as you should when you use a verbatim quote as well as when you paraphrase a specific argument). With the Zunes article mentioned above, we are out of luck: no pdf version is currently available. But here is a quotation from another article:

When you use the pdf version, you get the same page layout as in the printed journal and consequently you can make specific page references. The link to the pdf file still points to the beginning of the article only (there are no "named fragments"), but once the document is displayed in the Acrobat reader it is very easy to jump to a particular page. The drawback of using the pdf version is that in many cases these files are huge. In the example above, the pdf version of this article of just 9 pages is 2.3 MB, whereas the htm version of the same article is a mere 165 KB -- so it downloads much quicker. But the html version of an article often does not include (all) graphics, tables, and charts -- though some do.[3]  Also, note that in EBSCO, the persistent link points to the record of the article which in turn may include links to both the htm and the pdf version of the text (if available); there are no direct persistent links to the pdf version in EBSCO.

General recommendation: When you are still just screening articles, use the htm version. Once you decide that an article will be an important source for your paper, get the pdf version (if available) so that you can properly cite and quote using page numbers.
 

Adding the URL to the reference section. The URL for the online source of a journal article should also be added to the reference section. Here, however, you should add the URL as visible text -- as even online papers get printed out, because many people find it hard to read a paper on screen only. But you should also make the visible text of the URL an anchor for a link to this location (see the reference section of this document for an example). When you use MS Word to write your document, you can set an option which provides for automatic conversion of any text starting with "http://" into a link [4]. This may even be the default setting. So, there is little extra work involved.

The exact form of how you add the URL varies with the style (MLA, APA, etc.) you pick. Some styles require that the URL is placed in angle brackets ( <  ....   > ), some require that the URL be preceded by the string "Available at:", but these are the nitty-gritty details I don't care about. A word of warning, though. Some of these URLs can be quite long and complicated. Avoid retyping URLs -- always copy-and-paste. In particular, if you have printed out the journal article yourself, the URL (or part thereof) may show on your printout (depending on how you have set up your browser), but very likely it is not the complete URL. Whenever you see a string of dots embedded in what you may think is the complete URL, it is not the complete URL. There are no working URLs which include a series of dots (like http://infotrac.com/webrequest/...../ajs45367.htm). Of course, by using EndNote and the EndNote/MS Word interface these problems are avoided anyway.
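The "string of dots" warning is easy to turn into a quick check on a list of URLs before you paste them into a reference section. This is a minimal sketch only; the function name looks_truncated is a hypothetical helper, and the cutoff of three or more consecutive dots is my own assumption about what a shortened printout looks like (two dots, as in a "../" path segment, are deliberately not flagged).

```python
import re

# Assumption: a run of three or more dots (or a single ellipsis character)
# marks a URL that was shortened for display and is therefore incomplete.
ELLIPSIS = re.compile(r"\.{3,}|\u2026")


def looks_truncated(url):
    """Return True if the URL contains a run of dots that suggests truncation."""
    return bool(ELLIPSIS.search(url))


urls = [
    "http://infotrac.com/webrequest/...../ajs45367.htm",
    "http://search.epnet.com/login.aspx?direct=true&db=aph&an=15557409",
]
for url in urls:
    status = "probably truncated" if looks_truncated(url) else "no dots found"
    print(status, "-", url)
```

A URL that passes this check is not necessarily persistent or complete; the check only catches the one mistake described above.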

Attribution of general web documents

Much of what has been discussed in the previous section, applies here as well -- especially the caution about the persistence of links and manually retyping URLs. But there are some differences as well.

Choice of anchor and target page. By "anchor" I mean the string of visible text that you turn into a hyperlink (typically shown in blue and underlined -- though you could make other choices). In case of journal articles, it is obvious where the link will lead you; but this is not true in case of general web documents. As a matter of fact, in a conventional paper you do not use a lot of such "primary sources". With respect to such primary sources, there are two ground rules:

It is not necessary to list every single link in the reference section, but you want to list all major documents and certainly all starting pages of web sites which you have used as sources. Some web documents may be fairly long and some may have an author identified by name and a specific title. These documents can be treated much the same way as "gray literature" (non-published, but circulated papers like those presented at professional meetings). Use the rules in your preferred reference style (MLA, APA, etc.) accordingly. Again, using EndNote software greatly facilitates this task. Use your "EndNote library" to document your web searches.

Non-persistence of web documents. The persistence problem is even more prevalent when considering general web documents and it comes in several varieties:

For many reasons, you cannot rely on web documents to remain available with exactly the same contents -- the way you can rely on journal articles and books. Therefore, it is necessary to download and save all documents which you consider crucial pieces of evidence, sources central to what you present in your paper, and to include these crucial documents in an appendix -- unless you are confident that the URL is persistent and the contents will not change (like the Supreme Court decisions site at Cornell, and there are many other sites that can be assumed to be constant). [7]
 

The "sources appendix"

All downloaded web pages are to be placed in the "sources appendix" and in the main body of your paper you link to these downloaded pages as you would if these web pages were still available with unchanged content. But rather than using external links (pointing to an outside web site), you use "internal" and/or "relative" links: "internal" links if you have the main paper, reference section, and "sources appendix" all in one big htm file (use only if the sources appendix is small), "relative" links if your "sources appendix" is a set of files or a folder of files "zipped" together with the file containing the main part of your paper.

Downloading web documents which are essential sources for your paper gives you the added advantage that you can edit most of these (an exception would be secure pdf documents where the creator may have barred any alteration, even if you have access to the full Acrobat software). Of course, you don't want to change the contents of such documents. By "editing" I only mean adding bookmarks/targets so that you can link to the specific portions of such documents from your main paper. This allows for much more precise attribution as few original web documents come with a sufficient number of "named fragments."

You place such links throughout your paper wherever you would use a link to the external web site, if you had a persistent URL and confidence that the page would stay the same. However, whether you link to a downloaded web page in the appendix or to the original page on a web server elsewhere, the same rule applies: clearly describe what kind of document the link will lead to. This is particularly important when you link to a "named fragment" as the reader will not immediately see the heading and possibly other identifying information at the top of this page.

Listing of appendix content. Also, you need to provide a "table of contents" for such an appendix (which should be the very last section of the main paper) and each entry in this appendix TOC should be a hyperlink to the document. Each entry must contain the information which would otherwise be listed in the reference section, including the original URL (even if non-persistent) and the date of the download. Long non-persistent URLs should be shortened to the page on the remote site where your search started; instead of the complete (but by now useless) URL the search string used to locate the document should be listed. What you list as part of the appendix TOC, need not be listed in the reference section to avoid duplication.
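If your appendix has more than a handful of entries, the TOC can be generated rather than typed. The following is a minimal sketch under stated assumptions: the function name toc_entry and the local file name "appendix/pfaw.htm" are hypothetical, and the layout (one list item per document, lines separated by <br>) is just one possible choice, not a required format. It lists exactly the items named above: a hyperlink to the downloaded document, the original URL, an optional search string, and the download date.

```python
from html import escape


def toc_entry(title, local_file, original_url, downloaded, search_string=None):
    """Build one appendix TOC entry as an html list item."""
    parts = [
        # The title is the anchor of a relative link to the downloaded copy.
        f'<a href="{escape(local_file)}">{escape(title)}</a>',
        f"Original URL: {escape(original_url)}",
    ]
    if search_string:
        # For non-persistent URLs, record how the document was located.
        parts.append(f"Search string: {escape(search_string)}")
    parts.append(f"Page downloaded on {escape(downloaded)}")
    return "<li>" + "<br>".join(parts) + "</li>"


entry = toc_entry(
    "People for the American Way (PFAW) web site (start page)",
    "appendix/pfaw.htm",  # hypothetical location of the downloaded page
    "http://www.pfaw.org",
    "4/27/2001",
)
print(entry)
```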

A sample appendix. For demonstration purposes, I have included an appendix (consisting of three downloaded web pages) for this document. I have added a fragment name to the first document in the appendix so that I can link directly to the issues PFAW names on its rather long start page. The second document was downloaded from the Lexis-Nexis data base and contains information about public sentiment on school vouchers from a survey conducted in January 2001. As this data base is "restricted", i.e., it can be accessed via an authorization process only (via specific IP address or specific ID/password), it is always preferable to find another generally available source instead. In this example, the specific public opinion data contained in the document can also be found at the generally available web site of the Gallup organization, but on an "active service page" (see above) only and somewhat buried in a fairly long page. So, it was necessary to download this page and to add a fragment name to point to the specific part of the document. Of course, in general it is not necessary to have the same piece of information twice -- as in this demonstration.

Technical structure. From a purely technical point of view, there are two ways to organize such an appendix:

When using an appendix of this kind (especially, when using the folder approach) make sure to define the links in your main paper properly. You need to use "relative links" simply specifying the location of any such file relative to the location of your main paper. It is easiest to keep the main paper in the same folder as all these downloaded pages. However, you may inadvertently overwrite files when different downloaded pages contain associated files (e.g., graphics files) with the same name like "arrow.gif". But if you take this risk, things stay simple: In this case, you just type the file name plus any "fragment name" when asked to specify an URL for the link.[8] Next we discuss the situation when the files making up the appendix are in different subfolders.

Relative links can include a "path" as well, meaning you link to documents which are located in a subfolder of the folder where your main document resides. This is a useful device for keeping order, but it may be "forced" upon you as well. E.g., when you save a web page (containing graphics) with MS IE or directly with MS Word 2000 (by typing an URL to a web page in place of a local file name when "opening" a file) the main (text) part of the page is saved in one file and all associated (graphics) files are saved in a subfolder. This subfolder has the same name as the main file, but the string "_files" is appended. -- When you place files in subfolders (relative to your main document) yourself and then have to enter a hyperlink yourself, make sure that the link does not contain any backward slashes ("\"); all slashes must be forward slashes ("/") -- notwithstanding that such "paths" on Windows computers use backslashes. For example, enter in the box where you specify the location of the file something like:
subfolder/docs/org1.htm
not
subfolder\docs\org1.htm
though the latter is the correct way to specify a file location on a Windows computer. To make matters even more confusing, if you make this mistake, you will not notice it right away. As long as the files reside on a local computer, browsers are smart enough to figure out what you mean. But when the files get moved to a web server (e.g., on to a course web page), these backslashes will produce error messages as the files you link to cannot be found. If you have picked up a bit of html language, the full "tag" for a relative link looks like this:
<a href="subfolder/docs/org1.htm"> or
<a href="subfolder/docs/org1.htm#name">  if you link to a specific "named fragment" in the file "org1.htm".
In contrast, and just as a reminder, "absolute" links look like this; the string you specify starts with "http://":
<a href="http://bb.hunter.cuny.edu/courses/1/2001SP-GSR716-00/content/subfolder/docs/org1.htm">
This is not a real link, don't expect it to work. But make sure that you don't leave a space before the http part, or your browser will try to interpret your link as a relative link -- and an error message will result.
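Both pitfalls (backslashes in a relative link, and a stray space in front of "http://") can be removed in one small cleaning step. This is only a minimal sketch; make_href and is_absolute are hypothetical helper names, and the code assumes the location is either a relative path or a complete URL as described above.

```python
def make_href(location, fragment=None):
    """Return a link target that is safe to use on a web server.

    Surrounding spaces are stripped, backslashes become forward slashes,
    and an optional fragment name is appended after a hash mark.
    """
    href = location.strip().replace("\\", "/")
    if fragment:
        href += "#" + fragment
    return href


def is_absolute(href):
    """True if the string starts with http:// (or https://), with no leading space."""
    return href.lower().startswith(("http://", "https://"))


# A Windows-style path becomes a proper relative link.
print(make_href("subfolder\\docs\\org1.htm", "name"))

# A leading space turns an absolute URL into something that is not recognized as one.
raw = " http://maxweber.cuny.edu/socio/index.htm"
print(is_absolute(raw), is_absolute(make_href(raw, "hot")))
```

The first call prints subfolder/docs/org1.htm#name, the form shown in the tag examples above.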

Finally, you need to be careful when preparing your zip file as the "path information" must be preserved.  Check the screen movie on producing zip files.
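If you prefer a script to a zip utility, the point about "path information" can be illustrated as follows. This is a minimal sketch, not the procedure shown in the screen movie; the function name zip_paper is hypothetical. It assumes your main paper and all appendix subfolders live in one folder, and it stores every file under its path relative to that folder, with forward slashes, so that relative links still work after unzipping.

```python
import os
import zipfile


def zip_paper(folder, zip_path):
    """Zip all files below `folder`, preserving subfolder paths.

    Returns the sorted list of names stored in the archive.
    """
    folder = os.path.abspath(folder)
    zip_abs = os.path.abspath(zip_path)
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(folder):
            for name in files:
                full = os.path.join(root, name)
                if os.path.abspath(full) == zip_abs:
                    continue  # do not pack the zip file into itself
                # Path relative to the paper's folder, e.g. subfolder/docs/org1.htm
                arcname = os.path.relpath(full, folder).replace(os.sep, "/")
                archive.write(full, arcname)
                names.append(arcname)
    return sorted(names)
```

Calling zip_paper("mypaper", "mypaper.zip") (both names are placeholders) returns the list of stored names; each entry should read like the relative links in your paper, e.g. subfolder/docs/org1.htm.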
 

A brief note on Plagiarism

[I have added this section with considerable reluctance, as I would like to assume that no student would ever do this, at least not one of my students. But actual experience has taught me differently. In order to be able to punish offenders, it seems to be necessary to be explicit about the rules and the possible consequences.]

"The word plagiarize is derived from a  Latin verb meaning 'to kidnap', and kidnapping or stealing someone else's ideas and presenting them as your own is regarded as a serious offense in Western academic culture and public life." (Raimes 2005:116) If you are unsure exactly what constitutes plagiarism and where to draw the line between plagiarism and paraphrasing, consult a textbook on writing like Raimes (2005:116-128). Another good source is "Avoiding and Detecting Plagiarism", a guide for graduate students and faculty published by the CUNY Graduate Center in March 2005.

Note that simply adding a link to the source is not sufficient. Even a paper in html format needs to stand on its own and needs to be fully comprehensible without following any of the links. Therefore, any direct citation of text found in other (online) documents must be clearly marked in your paper. E.g., if you copy from the syllabus of a Supreme Court decision (rather than paraphrasing), you need to use quotation marks, and the same holds for bills found in Thomas or similar data bases. Beware of excessive use of verbatim quotations. Do not use quotations to mask a lack of understanding of the contents (and, yes, "legalese" or other highly technical language can be hard to fully comprehend at times).

In Fall 2003, Hunter College started a subscription to a commercial service (Turnitin) which allows instructors to check on the "originality" of any text. It takes little effort to have a student paper checked and all passages marked which appear in identical or very similar form anywhere on the Web. So, make sure that you avoid any suspicion of plagiarism in your  paper by carefully marking all passages copied from other sources.

If I detect plagiarism or any other form of academic dishonesty, I  will  enforce the CUNY Policy on Academic Integrity and will pursue cases according to the Hunter College Academic Integrity Procedures.

Endnotes

[1] I have included this endnote for demonstration purposes only. To go back to where you were before in the document, simply click the "Back" button of your browser.
[2] In Netscape 7.x or Mozilla 1.7.x, go to "Edit"/"Preferences"/"Advanced"/"Cache" and click the button labeled "Clear Disk Cache". In Firefox 1.x, go to "Tools"/"Options"/"Privacy"/"Cache" and click the "clear" button; in Firefox 2.x, go to "Tools"/"Options"/"Advanced"/"Network"/"Cache" and click "clear now".  In MS IE 6, go to "Tools"/"Internet Options". Click on the "Delete Files .." button in the "Temporary Internet files" section of the "General" tab, in the pop-up window check the box "Delete all offline contents" and click OK. In MS IE 7, go to  "Tools"/"Internet Options".  Click on the "Delete ..." button in the "Browsing History" section of the "General" tab, in the pop-up window check the box "Delete files" in the "Temporary Internet Files" section.
[3] Note that there are "searchable" and "non-searchable" pdf versions of articles. The first kind is much better as you can search for specific text within the article and you even may be able to copy-and-paste quotations (depending on the security options set by the producer of the pdf document). In addition, the file size is considerably smaller. But these pdf files require more effort in producing them, so many pdf files in the journal data bases are of the second, less desirable kind.
[4] See separate document for the details of producing htm documents with MS Word.
[5] Make sure that you save all parts of the page including associated images. To do this, use the "whole page option" in MS IE. When using Netscape, you need to go to "edit page" first (opening Netscape Composer) and then save the document. Composer will save all associated files along with the main text file.  You can also save complete web pages by opening them directly in Adobe Acrobat, but you need access to the full software (not just the reader) -- or you can use the Acrobat icon in MS IE (if available).
[6] In Firefox, you can always right click somewhere on a frame and select "Open frame in a new window" and check the address box there. Also, by right clicking and selecting to "open in a new tab", you will get the document URL displayed in the address box. In MS IE, you need to right-click on the document and then select "Properties": a new window will pop up which contains the specific address (URL) and other information. Now, highlight the URL in this window and use CTRL-C (holding down the CTRL key and pressing "C") to copy the URL. Then proceed as if you had copied the URL from the address box. However, if the URL is very long the pop up window may not show the complete URL right away. You may have to scroll to make sure that you highlight and copy the complete URL. An alternative approach is to right click on a frame, and then "add" this frame "to favorites" and check the properties of this favorite (opening the favorites folder, right-clicking on the specific favorite and selecting properties).
[7] As long as you don't distribute your paper outside, there is no copyright problem with this practice. This is "fair use". And as long as I keep your paper in a secure area of our course web site, this does not create a copyright problem either.
[8] I have taken this risk for the sample (demonstration) appendix to this document. So, you may discover some out of place graphics elements -- in case certain files belonging to previously downloaded pages were overwritten by files of the same name belonging to pages downloaded later. I have used Netscape (Composer) to save complete pages; when you use MS IE, you are safe as associated files always get placed in a subfolder to avoid such inadvertent overwriting of files. And this is pretty much standard procedure now; Firefox 2.x does the same.
 

References

Prentice, Julia C.; Pebley, Anne R.; Sastry, Narayan. "Immigration Status and Health Insurance Coverage: Who Gains? Who Loses?" American Journal of Public Health 95, 2005, p. 109-116. http://search.epnet.com/login.aspx?direct=true&db=aph&an=15557409 

Raimes, Ann. Keys for Writers: A Brief Handbook -- Fifth Edition. Boston: Houghton-Mifflin, 2008.

Raimes, Ann and Maria Jerskey.  Universal Keys for Writers -- Second Edition. Boston: Houghton-Mifflin, 2008.

Zunes, Stephen. "The American peace movement and the Middle East." Arab Studies Quarterly 20, 1998, p. 29-52.
http://search.epnet.com/direct.asp?AN=612500&db=aph&
 
Note that I have used the generic (rather than the Hunter-specific) form of the URL and sometimes these seem to work even without being authenticated. However, most likely you have been authenticated at some time but have just not closed your Internet connection in between (many broadband users simply stay connected all the time). For your paper, you are supposed to use the Hunter-specific form of the URL. Also, there are some variations in the structure of these URLs (look for the slight differences between the URL for the Prentice piece and the Zunes article). These differences are irrelevant and have no deeper meaning or importance. -- Also note that this is not APA/GSR716 style; e.g., APA style uses only the initial letter for the first name.


Appendix

  1. People for the American Way (PFAW) web site (start page)
     Original URL: http://www.pfaw.org ; page downloaded on 4/27/2001

  2. Gallup Poll result on "School vouchers" (Jan 2001) as documented in Roper Center Poll data base via Lexis-Nexis
     Original URL: http://web.lexis-nexis.com/universe/form/academic/s_roper.html
     Search string: Keyword = school voucher , Roper Accession Number = 0376449
     Page downloaded on 5/4/2001

  3. Public opinion on "School vouchers" (Jan 2001) from Gallup Organization
     Original URL: http://www.gallup.com/poll/indicators/indeducation.asp ; page downloaded 5/4/2001
     (Note that the original URL no longer works; a good illustration why it is necessary to download crucial documents and place them in an appendix rather than relying on the URL to work forever.)