Why Not Convert The Penn State Event Data Project to Windows?

Note: Footnotes (*) and email commentary are found at the end of this page.

The opinions expressed on this page are those of Philip A. Schrodt, a lowly programmer, and do not necessarily represent a consensus within the Penn State Event Data project.

Initially posted: 20 February 1998

[ Macintosh | Source code | Evil empire | Proposal | Footnotes | Email | Postscript,1999 ]

Fundamentally, there are three reasons not to convert the Penn State Event Data Project to Windows:

The Macintosh version works

The Penn State Event Data Project works fine on the Mac. Okay, there are still a small number of unresolved bugs, but only a small number.

Meanwhile, there are an estimated 20 million Macs in use (1) -- we're not talking the Amiga or CP/M here. The Penn State Event Data Project will run on any Macintosh with 2 MB of memory. We recently purchased a pre-owned Mac from the local used computer emporium for $350, and that included a color monitor. Our cheapest Penn State Event Data Project-capable machine is an SE that we fished out of a dumpster and enhanced with a used hard disk we bought for $25. The Penn State Event Data Project is free; the BBEdit Lite text editor is free; and plenty of communications software is free. That's all you need for a functional coding platform. If you can afford NEXIS, you can afford a Macintosh.

Any operating system port is going to introduce bugs, and in the past six months, despite over 1000 hits on the Penn State Event Data Project web site, we've only had two offers for beta-testing. Any Windows version is going to be buggy for a long, long time.

Meanwhile, the more I read about programming for Windows, the less I'm looking forward to that prospect. Macintosh code is demonstrably stable across multiple upgrades of the operating system; Windows code seldom seems to survive even upgrades of the device drivers. SPSS, a multi-million dollar company, just decided to stop supporting multiple platforms (dropping the Mac) -- should our little project do otherwise?

So you still really, really, really want to do your machine-coding project in Windows? FRED -- Doug Bond's event coder -- runs in Windows. Pricing for FRED is unclear, but then SPSS is no bargain either. But maybe you'd like to port the source code? Read on.

If anything needs conversion, it's the Pascal source code

The Penn State Event Data Project source code -- all 16,000 lines of it -- is almost a decade old, and it was written when 2 MB of RAM was considered a lot of memory. More problematic, the Penn State Event Data Project is written in Pascal, which clearly has been superseded by C/C++.

As an instructor who has taught both Pascal and C (2), I recognize the advantages of the simplicity of Pascal. When the KEDS project began, Pascal was more of a standard in the world of personal computers than C -- for example the original Mac operating system was primarily written in Pascal.

But most current software development (including all Penn State Event Data Project utilities) is now done in C or C++. And like most programmers, I love C. ANSI C has standardized the language and, more relevant to our immediate problem, there is virtually no cross-platform translation software written for Pascal. Earlier I had decided to convert KEDS using Borland's Delphi system. Delphi 2 uses Pascal as the coding language, but Delphi 3? -- C++. In fact I can't even compile the Penn State Event Data Project to the Power PC chip: Symantec's Think Pascal was never converted to handle the PPC and CodeWarrior's Pascal string functions work slightly differently than Think's. (3)

Operating systems that run on both Power PC and Intel chips are becoming available -- BeOS, Apple's Rhapsody, and of course Linux. Automated cross-system translation software, notably Metrowerks' Latitude, is also becoming available. But all of these support only C/C++, not Pascal. We've got a serious dead-end here. Even if the Pascal version of the Penn State Event Data Project is converted to Windows 95 via Delphi, we are essentially maintaining a legacy system -- in fact because we are running MC680x0 code on the Mac, two legacy systems. Not cool.

Do I relish the thought of translating 16,000 lines of code from Pascal to C++? Nope. Do I relish the thought of rewriting the Penn State Event Data Project based on lessons learned from ten years of experience in machine coding? Maybe.

Envision a new Penn State Event Data Program with

  • 10 to 20 times the speed
  • systematic facilities for handling verb forms and synonyms
  • an "attribution engine" to handle SAID
  • proximity limitations in patterns

Perhaps this would be worth the effort.
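
What might a "proximity limitation" look like in practice? Here is a minimal sketch in Python, assuming a pattern is nothing more than two words that must occur within a fixed number of tokens of each other. The function name `matches_within` and the `max_gap` parameter are hypothetical illustrations, not anything taken from the existing Pascal code or its dictionary format.

```python
def matches_within(tokens, first, second, max_gap):
    """Return True if `second` follows `first` with at most `max_gap`
    tokens between them. Hypothetical helper, for illustration only."""
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        # Look only at the tokens the proximity limit allows:
        # up to `max_gap` intervening tokens plus the target itself.
        window = tokens[i + 1 : i + 2 + max_gap]
        if second in window:
            return True
    return False


text = "ISRAEL SAID IT WOULD ACCEPT THE LATEST PEACE PROPOSAL".split()
print(matches_within(text, "ACCEPT", "PROPOSAL", max_gap=3))  # three words intervene
print(matches_within(text, "ACCEPT", "PROPOSAL", max_gap=1))  # limit is too tight
```

The first call matches and the second does not, because three words separate ACCEPT from PROPOSAL. A real coder would presumably need something richer (multi-word phrases, limits that respect clause boundaries), but the basic idea is just a bounded window.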


"Each individual has a personal duty to resist the evil empire." -- Dr. Paul Johnson, The Political Methodologist 8,1:36

The recent actions of Gates et al. are getting decidedly irritating, if not frightening. (4) Do we really want to see the computing world reduced to one and only one operating system? From a corporation with the business plan of John D. Rockefeller and the public relations of Barney the Dinosaur? (5) Do you like your cable TV company? -- you'll just love a situation where everything from your personal computer (6) to your food processor is running on Windows NT. Isn't this exactly what the whole personal computer revolution was supposed to be getting away from?

[It is, however, amusing to observe that barely a week goes by without an article in InfoWorld from corporate MIS types bemoaning the shrinking number of choices they have available. By 1999, it's going to be Intel and NT, period. Terrible situation. I agree, but who is responsible for this state of affairs? Who banned Macintoshes from the corporate workplace? Who said Unix was too esoteric? As a noted 19th century social scientist pointed out, history repeats itself, first as tragedy, and then as farce. We're in the farce stage now. Stew in your own juice, fellows...(7)]

Move to the academic environment, and it gets really scary. In response to the possibility of huge grants from Intel, and supposed savings from supporting a single (difficult to support...) platform, universities across the country are banning everything except Intel-based hardware and you-know-what operating system. At my own cherished institution, there is even a proposal to require all students to learn Microsoft Office for Windows -- not word processing, not spreadsheets, not even basic computing, but MS-Office. Students, to their credit, have nothing but ridicule for this idea.

I have no illusions that Apple, as a company, is a great success story. Apple, as a company, squandered one of the greatest opportunities in history for building commercial prosperity through a technological breakthrough. But the Macintosh operating system continues to provide a far more elegant and adaptable solution to most computing problems than any variant of Windows, and more importantly continues to provide a source of innovation from which Microsoft can steal. It is therefore worthy of support, warts and all. Apple ain't dead yet. (8)

So, will not porting the Penn State Event Data Project to Windows stem this irrevocable tide of monopoly dominance? Not by itself. But it can't hurt. When the survey from the computer center comes around (9) and asks whether your department is using any software that absolutely requires a Mac, the answer is yes. Every Windows user who does research with the Penn State Event Data Project is another person who has learned the Mac, and who may see through the glossy Wintel propaganda machine and decide for themselves which system is easier to use.(10)

As the rest of the world reels under the assault of "No one ever got fired buying Microsoft" -- the fear-uncertainty-&-dread motto once employed by legions of IBM salesmen in identical suits (11) -- programmers in the academic world have an obligation to preserve some semblance of independence. Against IBM, that independence gave the world Unix; against Intel, RISC; and against Microsoft...well, dammit, at least we've got to try!


A Modest Proposal

So, where do we go from here? We are considering the following changes in the future of the Penn State Event Data Project:

  1. The complete Pascal source code for the Penn State Event Data Project will be available on the web site on 1 July 1998. If you want to port it to Windows, ahlan wa-sahlan. (12) It's got pretty good internal documentation, and there's always the manual. If you have a compelling need for the code prior to that point and are willing to make the port available on the web site, contact me.
  2. After completing a couple of other backlogged projects, I will be doing a port of the Penn State Event Data Project to C++ with the intention of making the compiled program available on as many platforms as can be maintained using automated porting tools such as Latitude. The new Penn State Event Data Project will be backwards-compatible with existing dictionaries -- you will be able to run the program with the dictionaries, but the new program will not support all of the features.

Okay, what do you think? Send comments to schrodt.parusanalytics.com.

Postscript, February 1999

An assortment of observations since this was written a year ago:

  1. 1600 visitors to the web site last year and no one has asked for that source code. It's available, honest. It's documented, sort of. It's Pascal, definitely.
  2. Unfortunately, some cracks are beginning to appear in the compatibility of the Penn State Event Data Project and the latest Macintosh OS releases (8.0 and 8.5) -- it looks like multiple invocations of the standard file dialog might be causing problems. So far it doesn't seem to crash the machine until the second time through. Still, not good.
  3. Well, Bill and his minions haven't fared too well over the past year, eh? NT 5.0 is late, bloated and buggy (and renamed). Bill Gates's popularity rivals that of Kenneth Starr. And meanwhile Apple, under iCEO Steve Jobs, has staged an incredible comeback.

    The "banish Apple" tide in academia seems to have ebbed -- for example our Provost backed off, appealing to Mac-owning faculty (I'm not making this up...) "Don't take me to some dark corner and shoot me -- you can keep your Macs!" However, we seem to have a generation of graduate students who know only Windows -- 100% of our entering class this year used Windows.

  4. Linux has emerged as a clear alternative for research software. It'll run on anything -- heck, Linux will run on a PalmPilot -- and it is famously stable. If the Penn State Event Data Project is upgraded, the reference operating system will be Linux.
  5. That upgrade is inching closer, but with no firm deadlines (we're also looking for some public funding for this). Any new version will be a totally new program, not simply a port to a new language and operating system. In all likelihood, it will use a separate, public-domain parser and implement a coder that operates on parsed representations of a text.

    The Power PC, C versions of our utilities run like the proverbial bat-out-of-hell compared to their MC680x0, Pascal counterparts -- I've seen speedups as high as a factor of 50 -- so a new program is likely to be much faster as well as more accurate.

    We're unlikely to provide for strict backwards compatibility -- for example dictionaries for the new system will require parts-of-speech information ("FORCE-verb" vs. "FORCE-noun") -- but there will be tools for easing the transition. After all, I use this program in my own research.

    Finally, next time around the source code will be available from the beginning of the project and can be modified under Linux-style "open source" norms.

    Update: December 1999 -- We got the funding, and the program is going to be available by late March 2000. See the TABARI page. The early version will be totally compatible with Penn State Event Data Project dictionaries; later versions will likely begin to diverge.
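
To make the parts-of-speech requirement in item 5 concrete, here is a minimal sketch in Python of a dictionary keyed on both the word and its tag. It assumes entries are written exactly as "WORD-tag", as in the "FORCE-verb" vs. "FORCE-noun" example above. Everything else (the names `DICTIONARY`, `parse_key` and `lookup`, and the code strings) is a hypothetical placeholder, not the actual dictionary format or any real event code.

```python
# Hypothetical entries: the values are placeholders, not real event codes.
DICTIONARY = {
    ("FORCE", "verb"): "placeholder-code-for-coercion",
    ("FORCE", "noun"): "placeholder-code-for-military-unit",
}


def parse_key(entry):
    """Split an entry such as 'FORCE-verb' into (word, part_of_speech).

    Splitting on the last hyphen keeps hyphenated words intact."""
    word, _, pos = entry.rpartition("-")
    return word, pos


def lookup(entry):
    """Return the code for a tagged entry, or None if it is not listed."""
    return DICTIONARY.get(parse_key(entry))


print(lookup("FORCE-verb"))
print(lookup("FORCE-noun"))
print(lookup("FORCE-adjective"))  # not in the dictionary, so None
```

The point of the tag is that the same surface word can map to different codes. An untagged dictionary has to pick one reading of FORCE or rely on surrounding patterns to tell them apart.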


Footnotes:

1. Although you'd never know this from the popular press, including a number of publications that are edited and formatted using Macintosh systems.

2. and BASIC and COBOL and Algol and FORTRAN and the CDC Compass assembly language.

3. The Penn State Event Data Project makes quite a few calls to Pascal string functions...

4. Followers of our web site can confirm that the Penn State Event Data Project was opposing Microsoft years before this became popular. Heck, backing the underdog is a long and popular tradition out here in Kansas, home of Alf Landon, Earl Russell Browder, and Robert Dole.

(Basketball is another story, but let's not discuss religion.)

5. Although recently Microsoft has been conducting itself with the public relations of John D. Rockefeller and the technological acumen of Barney the Dinosaur.

6. Make that your networked computer that continually checks in with Redmond and reports whether you've installed any non-conforming software. Paranoid? -- ask Netscape.

7. Don't get me wrong -- there is a place for Microsoft Windows/Office. If I were managing a network for 237 undertrained and underpaid personnel who had the combined cognitive capacity of a hibernating muskrat, I'd insist they use Windows/Office too. To paraphrase Hal Hardenbergh's famous characterization of COBOL, "There is a need for Windows, just like there is a need for maggots. But we don't have to like either, and we don't."

8. By providing not-on-Windows software, I am also supporting that other great holdout against the Microsoft onslaught: Linux. Beware: the MIS drones who ban Macintoshes will go after Unix next (particularly Linux which -- god forbid -- is supported entirely by its user community).

9. In those rare instances where Mac-trashing academic bureaucrats actually do a survey.

10. Curiously, we have found that the single greatest impediment that Windows users have when using the Penn State Event Data Project is their assumption that the Mac OS will be as complicated as Windows.

11. Yes, they were all men. Of course, once we hated IBM, but now IBM is an ally -- bring on those 1-gigahertz PPC chips, Big Blue! Those chips won't run Windows, but they'll run the Penn State Event Data Project (in MC680x0 emulation, at least). (Hmmm, well, I suppose they will run Windows in emulation as well. So we gotta do that C++ port to keep the technological edge, right?)

12. "Please go ahead."


Email and commentary

"S-U-R-R-E-N-D-E-R-,-P-H-I-L-I-P"
B.G., Redmond, Washington, USA

Response: Yeah, right, and get those flying monkeys out of my office. Where's that bucket of water??

Who the heck is Earl Russell Browder?
J.E.H., Washington DC, USA

Response: This Wichita-born Presidential candidate ran against Franklin Roosevelt in 1936 and 1940 on the American Communist Party ticket. He lost.