c9787d32eb9ee69d6682257a94b1d32d.ppt
- Number of slides: 1
Corpora: From magnetic tape to web access
Knut Hofland, hofland@uib.no, icame.uib.no/history/poster.ppt
UNIFOB Aksis, Bergen, Norway

1961-64: Brown Corpus was made.
12.02.1977: ICAME founded in Oslo.
1977-79: ICAME News started March 1978. Converted the Brown Corpus from the original punched card format to a more readable format and corrected errors found during the tagging of the corpus (from 1971-78).
29-30.03.1979: First ICAME conference in Bergen.
1979: LOB Corpus was finished in Oslo/Bergen. Concordances were made to both the Brown and LOB corpora. The texts and concordances were distributed on magnetic tape and microfiche. One fiche = 207 pages (each with 72 lines of 132 columns). The LOB concordance contained frequency counts from the Brown Corpus. The LOB KWIC used 100 fiches. London-Lund corpus was distributed on tape.
1981: London-Lund KWIC concordance available on tape.
1982-1985: POS-tagging of LOB in Lancaster and Bergen (CLAWS 1, Constituent Likelihood Automatic Wordtagging System). Word list and suffix list for look-up were based on the tagged Brown Corpus. Text and concordance available on tape.
1987: Melbourne-Surrey Corpus available (100 K word newspaper text). ICAME News -> Journal. A version of the Brown Corpus indexed by the MS-DOS program WordCruncher was made by Randall L. Jones from Brigham Young University (11 MB including index files). The index was so efficient that the program could be used on a standard IBM PC XT/AT. Distribution on diskettes started. Kolhapur Corpus (Indian English) and Lancaster Spoken English Corpus were added to the collection. A mail-based infoserver was started (FAFSRV at NOBERGEN, EARN/BITNET).
1990: Polytechnic of Wales Corpus.
1992: Lancaster Parsed Corpus, Corpora list started. FTP info-server. Gopher server in 1993. ICAME CD-ROM collection, version 1. Contained Brown, LOB, Kolhapur, London-Lund and Helsinki Corpora, all indexed by WordCruncher. Macintosh/Unix version of the texts. Texts also indexed by the MS-DOS program TACT.
1995: Newdigate newsletters, ICAME web site, 900 members on Corpora list.
2000: ICAME CD-ROM, version 2. COLT CD-ROM with sound files. Internet search of the main corpora for holders of the CD-ROM.
2009: More than 3000 members on the Corpora list.

Brown Corpus, original punched card format (line reference columns A01, 0010-0100 not reproduced):

    **R**T *THE *FULTON *COUNTY *GRAND *JURY SAID *FRIDAY AN INVESTIGATION
     OF *ATLANTA**AS RECENT PRIMARY ELECTION PRODUCED **QNO EVIDENCE**U TH
    AT ANY IRREGULARITIES TOOK PLACE. **R**T *THE JURY FURTHER SAID IN TER
    M-END PRESENTMENTS THAT THE *CITY *EXECUTIVE *COMMITTEE, WHICH HAD OVE
    R-ALL CHARGE OF THE ELECTION, **QDESERVES THE PRAISE AND THANKS OF THE
     *CITY OF *ATLANTA**U FOR THE MANNER IN WHICH THE ELECTION WAS CONDUCT
    ED. **R**T *THE *SEPTEMBER-*OCTOBER TERM JURY HAD BEEN CHARGED BY *FUL
    TON *SUPERIOR *COURT *JUDGE *DURWOOD *PYE TO INVESTIGATE REPORTS OF PO
    SSIBLE **QIRREGULARITIES**U IN THE HARD-FOUGHT PRIMARY WHICH WAS WON B
    Y *MAYOR-NOMINATE *IVAN *ALLEN *JR**.. **R**T **Q*ONLY A RELATIVE HAND

Brown Corpus, more readable format (line reference columns not reproduced):

    The Fulton County Grand Jury said Friday an investigation of Atlanta's recent primary election produced "no evidence" that any irregularities took place.
    The jury further said in term-end presentments that the City Executive Committee, which had over-all charge of the election, "deserves the praise and thanks of the City of Atlanta" for the manner in which the election was conducted.
    The September-October term jury had been charged by Fulton Superior Court Judge Durwood Pye to investigate reports of possible "irregularities" in the hard-fought primary which was won by Mayor-nominate Ivan Allen Jr&.

Development of computers 1970-2009
1970s Mainframe computers: Univac, IBM, ICL
1971 Floppy disk (diskette)
1975 Altair 8800 personal computer
1976 Apple I
1977 Apple II
1978 VisiCalc, spreadsheet
1979 WordStar, word processing software
1980 Seagate 5.25” 5 MB hard disk
1981 IBM PC (4.77 MHz, 16/64 kB RAM, 160 kB 5.25” diskette, MS-DOS, CGA)
1982 Commodore 64
1983 IBM PC XT (128 kB RAM, 10 MB HD, 360 kB diskette)
1983 Apple Lisa, first GUI
1984 Apple Macintosh (128 kB, 400 kB 3.5” diskette)
1984 First HP laser printer (Apple LaserWriter PS 1985)
1984 IBM PC AT (286 6-10 MHz, 20 MB HD, 256 kB RAM, 1.2 MB diskette, EGA)
1984 MS-DOS 3.1
1985 Windows 1
1985 Philips CM-100 CD-ROM (Apple 1988)
1987 PS/2 (386 8-20 MHz, 640 kB RAM, 1.44 MB 3.5”, 20-70 MB HD, VGA)
1990 World Wide Web, text version
1990 Typical PC: 486 25 MHz, 4 MB RAM, 150 MB HD
1992 Windows 3.1
1993 Mosaic graphical web client
1994 MS-DOS 6.0
1995 Windows 95
1997 Typical PC: Pentium II 233 MHz, 64 MB RAM, 4 GB disk
2001 Windows XP
2007 Windows Vista
2009 Portable PC: Dual Core 2.2 GHz, 4 GB RAM, 400 GB HD
2009 Desktop PC: Quad Core 2.6 GHz, 16 GB RAM, 1000 GB HD
Moore's law: transistor count doubling every two years

Content of ICAME CD, version 2: (listing not preserved)

Future:
- More material, new CD/DVD
- More corpora searchable on Internet
- Part of CLARIN (www.clarin.eu)

Some statistics: 1980 = 5 MB, 2009 = 1000 MB
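The poster's two Brown Corpus samples (upper-case punched card text and its readable counterpart) hint at how the conversion could work. The following is a minimal sketch, not the program actually used in Bergen. The code meanings are assumptions inferred only from comparing the two samples: `*` capitalizes the next letter, `**Q` and `**U` open and close a quotation, `**A` stands for an apostrophe, and `**R**T` is taken to start a new text unit. The function name `decode_brown_cards` is hypothetical, and the abbreviation code seen in `*JR**.` is left untouched because the samples do not pin it down.

```python
import re

# Double-star codes, inferred from the poster's samples (assumptions).
DOUBLE_STAR_CODES = {
    "**Q": '"',   # opening quotation mark
    "**U": '"',   # closing quotation mark
    "**A": "'",   # apostrophe, as in *ATLANTA**AS -> Atlanta's
}

UNIT_MARKER = "**R**T"  # assumed to open a new text unit (paragraph)


def decode_brown_cards(card_lines):
    """Turn fixed-width upper-case card lines into a list of readable units."""
    # Card lines break mid-word, so they are joined without a separator.
    text = "".join(card_lines)
    units = []
    for raw in text.split(UNIT_MARKER):
        unit = raw.strip()
        if not unit:
            continue
        # Resolve the double-star codes first so their stars are not
        # mistaken for capitalization markers.
        for code, replacement in DOUBLE_STAR_CODES.items():
            unit = unit.replace(code, replacement)
        unit = unit.lower()
        # A single star capitalizes the letter that follows it.
        unit = re.sub(r"\*([a-z])", lambda m: m.group(1).upper(), unit)
        units.append(unit)
    return units


if __name__ == "__main__":
    sample = [
        "**R**T *THE *FULTON *COUNTY *GRAND *JURY SAID *FRIDAY AN INVESTIGATION",
        " OF *ATLANTA**AS RECENT PRIMARY ELECTION PRODUCED **QNO EVIDENCE**U TH",
        "AT ANY IRREGULARITIES TOOK PLACE.",
    ]
    for unit in decode_brown_cards(sample):
        print(unit)
```

Run on the first three card lines, the sketch reproduces the first sentence of the readable sample. A real conversion would also need the line reference columns and the remaining codes of the original coding scheme.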
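The microfiche figures quoted for the LOB KWIC concordance (207 pages per fiche, 72 lines of 132 columns per page, 100 fiches) can be turned into a rough capacity estimate. The sketch below only multiplies those numbers. It assumes every page is completely filled, so the results are upper bounds, and the function name `fiche_capacity` is made up for illustration.

```python
PAGES_PER_FICHE = 207     # from the poster
LINES_PER_PAGE = 72       # from the poster
COLUMNS_PER_LINE = 132    # from the poster


def fiche_capacity(fiches):
    """Return (lines, character positions) for a number of full fiches."""
    lines = fiches * PAGES_PER_FICHE * LINES_PER_PAGE
    chars = lines * COLUMNS_PER_LINE
    return lines, chars


if __name__ == "__main__":
    lines, chars = fiche_capacity(100)  # the LOB KWIC used 100 fiches
    print(f"{lines:,} lines, {chars:,} character positions at most")
```

For 100 fiches this gives at most 1,490,400 concordance lines and 196,732,800 character positions.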