Скачать презентацию Python with COM Get at your Office Data Скачать презентацию Python with COM Get at your Office Data

18d0312435a516ca6b206b42053c50fc.ppt

  • Количество слайдов: 56

Python with COM Get at your Office Data 10 -Nov-98 Python with COM ~ Python with COM Get at your Office Data 10 -Nov-98 Python with COM ~ Christian Tismer 1

Contents 1. Introduction 2. Using COM with Python 3. Accessing Excel Data 4. Representing Contents 1. Introduction 2. Using COM with Python 3. Accessing Excel Data 4. Representing data tables in Python 5. Reading data from Access databases 6. Reading Word tables 7. Processing of data in Python 8. Creating results A. Supplemental 10 -Nov-98 Python with COM ~ Christian Tismer 2

1 Introduction 4 Foreword 4 Prerequisites 4 Short overview on data management 4 whetting 1 Introduction 4 Foreword 4 Prerequisites 4 Short overview on data management 4 whetting your appetite: some online examples 4 DM problems handled in this session 4 other DM problems one should know of 4 Specific goals of this session *collect data from different sources, convert them into a suitable structure, modify them and put them back into some other form 10 -Nov-98 Python with COM ~ Christian Tismer 3

1. 1 Foreword 4 Some words. . . (1 of 2) 4 The following 1. 1 Foreword 4 Some words. . . (1 of 2) 4 The following tutorial is on Data Management on the Windows platform. The main target is interaction with Office Objects like Word tables, Excel sheets and Access tables and queries. Therefore, the Win 32 COM interface plays a central role, which led to inclusion of COM into the tutorials title. 4 Nevertheless, my primary intent is to give you support, practice and a row of tools, to make your data processing tasks more effective. You will get working examples on how to explore data, how to convert it, find a better shape, insert it into your result set, optimize for speed, make your data fly. 4 You will not get complete applications to handle specific tasks. Instead, you get working code pieces and modules from production code, together with hints, tips, work-arounds, tricks, whatever can be squeezed out of a single person in 31/2 hours. 4 The majority of materials has been prepared on transparencies. I will probably publish them as a preview on the Internet, in order to let you prepare specific questions in advance, which are not covered yet. Hints by email are welcome. 10 -Nov-98 Python with COM ~ Christian Tismer 4

1. 1 Foreword 4. . . before we begin. (2 of 2) 4 The 1. 1 Foreword 4. . . before we begin. (2 of 2) 4 The whole course material will be handed out to you on a CD-ROM. This includes the current Python distribution, all source code, also modules which are not in the public domain and contain a copyright statement. 4 Attendees of this tutorial gain the right to use our modules for any purpose at home or in their company in order to increase their productivity. 4 Excluded is the right of publishing, selling or distributing original or modified versions of our software, or bundling it as part of a product. Companies which wish to do so need to contact Professional Net Service about conditions. 10 -Nov-98 Python with COM ~ Christian Tismer 5

1. 2 Prerequisites 4 In order to make use of the materials which are 1. 2 Prerequisites 4 In order to make use of the materials which are handed out to you in the tutorial, you need the following equipment: PC workstation running Windows 95, 98, NT 4. 0 or up Not below 32 MB of main memory CD-ROM drive Python 1. 5. 1 Win 32 all-122. exe Office Professional 97 or 98 (including Access) 10 -Nov-98 Python with COM ~ Christian Tismer 6

1. 3 On Data Management 4 Data management is quite a large field. The 1. 3 On Data Management 4 Data management is quite a large field. The complete scope is not so well defined. A try to give an idea: A Data Management department is usually responsible to take care about data entry, data quality, data conversion, data verification, data access security, data storage security, preparation of raw data listings and basic statistics, data and report archival. This can be extended / simplified to “everything necessary, until a statistician’s work can begin”. Data Managers are more and more confrontated with new evolving data formats, multiple data systems being used in parallel, and much increased demands on the ouptut quality. Providing simple text files as a report is in most cases insufficient. Modern Office tools have set new standards on what can be expected and force the Data Manager to not only produce data, but also present them in a convenient outfit. 4 You can save a lot of time for all of the above using Python with COM 10 -Nov-98 Python with COM ~ Christian Tismer 7

1. 4 DM problems 4 DM tasks tackled in this session: Data conversion Data 1. 4 DM problems 4 DM tasks tackled in this session: Data conversion Data transport across applications Finding a model for given data Exploring an unknown set of data Techniques for import and export Common data formats 10 -Nov-98 Python with COM ~ Christian Tismer 8

1. 5 Other DM problems 4 DM tasks one should have heared about: Verification 1. 5 Other DM problems 4 DM tasks one should have heared about: Verification of transport by closing the loop Data entry and comparison Generating raw data listings Generating full reports with basic statistics Design and management of databases Data archival and retrieval Coping with customer defined structures : -( 10 -Nov-98 Python with COM ~ Christian Tismer 9

1. 6 Specific goals of this session 10 -Nov-98 Python with COM ~ Christian 1. 6 Specific goals of this session 10 -Nov-98 Python with COM ~ Christian Tismer 10

2 Using COM with Python 4 How to access a COM object 10 -Nov-98 2 Using COM with Python 4 How to access a COM object 10 -Nov-98 4 How to create and read an interface Python with COM ~ Christian Tismer 11

2. 1 Accessing a COM object 4 Create an interface identify which interface to 2. 1 Accessing a COM object 4 Create an interface identify which interface to use generate the Python interface figure out how to use it try to create an object >>> import win 32 com. client >>> e=win 32 com. client. Dispatch("DAO. DBEngine") >>> db=e. Open. Database("h: \ab\vergleich. mdb") >>> 10 -Nov-98 Python with COM ~ Christian Tismer 12

2. 2 Reading an Interface 4 Learn about the interface Buy books on the 2. 2 Reading an Interface 4 Learn about the interface Buy books on the remote application Use the application’s online help Read the generated Python code Try the interface from your Python shell >>> w=win 32 com. client. Dispatch("Word. Application") >>> w. Visible=1 >>> doc=w. Documents. Add() >>> doc. Range() >>> doc. Range(). Text="Hi SPAM 7" >>> 10 -Nov-98 Python with COM ~ Christian Tismer 13

3 Accessing Excel Data 4 Getting an Excel sheet into a Python table 4 3 Accessing Excel Data 4 Getting an Excel sheet into a Python table 4 working with ranges and attributes 4 Where to get help on my objects? 4 how do I get the data as it looks like? The hard way using COM the hard way using delimited (SDF) files 10 -Nov-98 Python with COM ~ Christian Tismer 14

3. 1 From Excel into Python 4 Getting an Excel sheet into a Python 3. 1 From Excel into Python 4 Getting an Excel sheet into a Python table • make sure to get accustomed to ranges • be careful with strings: they come as Unicode when reading multiple cells >>> xl=win 32 com. client. Dispatch("Excel. Application") >>> xl. Visible=-1 >>> wb=xl. Workbooks(1) >>> sh=wb. Worksheets(1) >>> for row in sh. Used. Range. Value: print row (L'Name', L'Age', L'Language', L'Salary') (L'Gates', 43. 0, L'Visual Basic', L'dooh') (L'Tismer', 42. 0, L'Python', L': -(') (L'Rossum', L'dunno, >42? ', L'Python', L'SPAM') >>> 10 -Nov-98 Python with COM ~ Christian Tismer 15

3. 2 Ranges and Attributes Many properties are themselves ranges >>> sh. Used. Range. 3. 2 Ranges and Attributes Many properties are themselves ranges >>> sh. Used. Range. Rows(1). Value ((L'Name', L'Age', L'Language', L'Salary'), ) >>> sh. Used. Range. Columns(1). Value ((L'Name', ), (L'Gates', ), (L'Tismer', ), (L'Rossum', )) >>> sh. Used. Range. Columns(1) Other properties are attributes >>> r=sh. Used. Range. Columns(1) >>> r. Font >>> r. Font. Size 10. 0 >>> r. Font. Size=20 >>> 10 -Nov-98 Python with COM ~ Christian Tismer 16

3. 3 Where to get help on my objects? 10 -Nov-98 Python with COM 3. 3 Where to get help on my objects? 10 -Nov-98 Python with COM ~ Christian Tismer 17

3. 4. 1 WYSIWYG data Part I The hard way using COM • Value 3. 4. 1 WYSIWYG data Part I The hard way using COM • Value property gives the true internal value • Text property gives the current text representation >>> r. Cells(2, 2). Number. Format="0. 000" >>> r. Cells(2, 2). Text '43. 000’ >>> r. Cells(2, 2). Value 43. 0 >>> You have to cycle through all the single cells to get at the formatted text Works, but is very slow 10 -Nov-98 Python with COM ~ Christian Tismer 18

3. 4. 2 WYSIWYG data Part II The hard way using SDF - very 3. 4. 2 WYSIWYG data Part II The hard way using SDF - very fast! • Excel exports WYSIWYG - You parse the output def split_delimited(s) : """split a delimited text file string into a list of tuples. Quotes are obeyed, enclosed newlines are expanded to tab, double quotes go to quotes""" # the trick to make this handy is to use as a placeholder eol = buffalo. findlinedelimiter(s[: 10000]) # guessing function parts = string. split(string. replace(s, "t", ""), '"') limits = (0, len(parts)-1) for i in range(len(parts)) : part = parts[i] if i%2 : part = string. replace(part, eol, "t") else : if not part and i not in limits: part = '"' parts[i] = part # merge it back txt = string. join(parts, "") parts = string. split(txt, eol) # now break by for i in range(len(parts)) : fields = string. split(parts[i], "") parts[i] = tuple(fields) return parts 10 -Nov-98 Python with COM ~ Christian Tismer 19

4 Representing data tables in Python 4 simple tables as with the Excel examples 4 Representing data tables in Python 4 simple tables as with the Excel examples 4 table wrapper class with named columns >>> tab [(1, 2, 3), (4, 5, 6), (7, 8, 9)] >>> import dataset >>> ds = dataset. Data. Set(["field 1", "field 2", "field 3"], tab) >>> ds Data. Set with 3 rows and 3 columns >>> ds. get. Field. Names() ['field 1', 'field 2', 'field 3'] >>> ds[-1]{'field 1': 7, 'field 3': 9, 'field 2': 8} >>> 10 -Nov-98 Python with COM ~ Christian Tismer 20

4. 1 Some Data. Set methods 4 4 4 4 4 ambiguous append. Columns 4. 1 Some Data. Set methods 4 4 4 4 4 ambiguous append. Columns append. Constant. Column cross. Tabulate de. Tabulate display. Column expand filter. By. Category filter. By. Column filter. By. Value. List flatten folded get. Column. Names, get. Field. Names get. Tuples 4 4 4 4 4 get. Unique. Column. Values guess. Column. Types has. Column insert item join notinlist reduced remove rename. Column, rename. Columns select. Columns sort. On. Column, sort. On. Columns split. By. Column. Values substitute. In. Column transform. By. Column union unique These methods are described in the dataset module. 10 -Nov-98 Python with COM ~ Christian Tismer 21

4. 2 A little Data. Set browser >>> import Py. Tabs, axsaxs >>> db 4. 2 A little Data. Set browser >>> import Py. Tabs, axsaxs >>> db = axsaxs. Accessor("h: /verwaltung/AB/Adressen/Adreßbuch. mdb") >>> ds = db. get. Data. Set("adressen") opening adressen reading records. . . This handy little tool is itself a COM 258 server which I wrote with Delphi in an >>> x=Py. Tabs. view. DS(ds) afternoon 10 -Nov-98 Python with COM ~ Christian Tismer 22

5 Reading data from Access databases 4 reading a table 4 inspecting table and 5 Reading data from Access databases 4 reading a table 4 inspecting table and field properties 4 creating queries dynamically 10 -Nov-98 Python with COM ~ Christian Tismer 23

5. 1 Reading an Access table Using the native COM interface >>> import win 5. 1 Reading an Access table Using the native COM interface >>> import win 32 com. client >>> e=win 32 com. client. Dispatch("DAO. DBEngine. 35") >>> db=e. Open. Database("h: /ab/vergleich. mdb") >>> rs = db. Open. Recordset("-finde-") >>> f = rs. Fields >>> names = map(lambda fld: fld. Name, f) >>> names ['Typ_3', 'Nummer', 'Finde', 'Bemerkung'] >>> rs. Move. First() >>> while not rs. EOF: . . . values = map(lambda fld: fld. Value, f). . . print values. . . rs. Move. Next() # never forget this one!. . . ['ABNORM', 0, 'Normal', None] ['ABNORM', 1, 'Abnormal', None] ['ACTTYP', 1, 'Mild', None] >>> 10 -Nov-98 This is usually written as "DAO. DBEngine". some machines seem to require ". 35". I believe this happens when no Office 95 was installed before, but I’m not sure. Python with COM ~ Christian Tismer 24

5. 1 Reading an Access table Using the axsaxs / dataset interface >>> import 5. 1 Reading an Access table Using the axsaxs / dataset interface >>> import axsaxs, dataset >>> db = axsaxs. Accessor("h: /ab/vergleich. mdb") >>> ds = db. get. Data. Set("-finde-") opening -findereading records. . . 241 >>> ds. get. Field. Names() ['Typ_3', 'Nummer', 'Finde', 'Bemerkung'] >>> ds[0] {'Bemerkung': None, 'Finde': 'Normal', 'Nummer': 0, 'Typ_3': 'ABNORM'} >>> ds. item(0) ('ABNORM', 0, 'Normal', None) >>> A dataset is a wrapper class around tabular data in Python. Axsaxs is a wrapper around DAO databases. 10 -Nov-98 Python with COM ~ Christian Tismer 25

5. 2 Accessing properties Access Table. Defs >>> import axsaxs >>> db=axsaxs. Accessor(r 5. 2 Accessing properties Access Table. Defs >>> import axsaxs >>> db=axsaxs. Accessor(r"h: abvergleich. mdb") >>> d=db. dao. DB >>> for td in d. Table. Defs: print td. Name -Finde-Med. List. AE_TAB (. . . ) Field properties >>> >>> 25 >>>. . . rs = d. Open. Recordset("AE_TAB") f=rs. Fields[0] f. Properties. Count for p in f. Properties: print p. Name Some can be changed by assignment 10 -Nov-98 Value Attributes Collating. Order Type Name Ordinal. Position Size Source. Field Source. Table Validate. On. Set Data. Updatable Foreign. Name Default. Value Validation. Rule Validation. Text Required Allow. Zero. Length Field. Size Original. Value Visible. Value Column. Hidden Column. Width Column. Order Decimal. Places Display. Control >>> Python with COM ~ Christian Tismer 26

6. 1 Reading Word tables 4 using COM (online with Word) >>> >>> >>> 6. 1 Reading Word tables 4 using COM (online with Word) >>> >>> >>> 1 >>>. . . . . >>> 11 >>> import win 32 com. client w=win 32 com. client. Dispatch("word. application") w. Visible=1 doc=w. Documents. Add("d: \ tmp\d. html") doc. Tables. Count this works with HTML, too! tbl = [] for row in range(1, 1+ len(doc. Tables(1). Rows)): line = [] for col in range(1, 1+len(doc. Tables(1). Columns)): try: line. append(doc. Tables(1). Cell(row, col). Range. Text) except: pass # exception for joined cells tbl. append(line) len(tbl) 10 -Nov-98 Python with COM ~ Christian Tismer 27

6. 2 Reading Word tables using Rich Text files (offline, RTF parser) # simple 6. 2 Reading Word tables using Rich Text files (offline, RTF parser) # simple class to get the text from RTF. # Especially to read tables in and get their values. import string, sys. path. insert(0, "c: \ab\python") import rtfpars class rtftext(rtfpars. rtfstream) : def __init__(self, fname) : rtfpars. rtfstream. __init__(self, fname) self. level = 0 def gettok(self) : code, val = self. gettoken() if code < 2 : self. level = self. level + code return code, val def skipuntil(self, target) : while 1 : code, val = self. gettok() if code == 0 : if not val or val in target : return val def skiphead(self) : self. skipuntil(["pard"]) 10 -Nov-98 def readuntil(self, target) : res = [] while 1 : tup = self. gettok() res. append(tup) code, val = tup if code == 0 : if not val or val in target: return res def getthing(self) : # a thing is a simple line or a table row. if self. level==0 : self. skiphead() line = self. readuntil(["par", "sect", "cell", "row"]) if (0, "intbl") not in line: return line buf = line if (0, "row") not in buf: buf = buf + self. readuntil(["row"]) cells = splitlist(buf, (0, "cell")) rest = cells[-1] del cells[-1] tok = rest[-1] del rest[-1] row = [] for cell in cells: row. append(splitlist(cell, (0, "par"))) row. append(rest) row. append(tok) return row Python with COM ~ Christian Tismer 28

6. 2 Reading Word tables (cont. ) Further examination is very data / problem 6. 2 Reading Word tables (cont. ) Further examination is very data / problem specific # helpers # later, we will have an own paragraph structure # uhhm, bad without one. hack. . . def gettext(para) : ret = [] for code, val in para: if code==2 : ret. append(val) elif val == "tab" : ret. append("t") return string. join(ret, "") def splitlist(lis, elem) : # splits list, but keeps the elem found at the end. res = [] while 1 : try : pos = lis. index(elem)+1 res. append(lis[: pos]) lis[: pos] = [] except Value. Error: res. append(lis) return res 10 -Nov-98 # the little app: get all data from tables def main(fname = "c: \ab\brivudin\513\urin\RE 14 -97 E. rtf") : global tables = [] rtf = rtftext(fname) intbl = 0 while not rtf. eof: row = rtf. getthing() if (0, "row") not in row : intbl = 0 pass # print gettext(row) continue if not intbl : tables. append([]) intbl = 1 textrow = [] for cell in row[: -2] : celltext = map(gettext, cell) textrow. append(string. join(celltext, "n")) tables[-1]. append(textrow) Python with COM ~ Christian Tismer 29

7 Processing of data in Python 4 reorganizing tables 4 exploring of data what 7 Processing of data in Python 4 reorganizing tables 4 exploring of data what is the contents what is the best datatype for this column? 4 Grouping operations 4 data normalization 10 -Nov-98 Python with COM ~ Christian Tismer 30

7. 1 Processing of data 4 reorganizing tables (1 of 3) A common task: 7. 1 Processing of data 4 reorganizing tables (1 of 3) A common task: de-tabulate data from many columns into a long one. Hard for Access or SQL, this is a cakewalk with a dataset. This is the raw data, prepared a little as an Access Query. The data columns must be reorganized into one clumn. Example taken from a huge Pharmaceutical project: Brivudin Oral had 26 large Access Databases with different structure. They were all harmonized and merged into one big Summary database. 10 -Nov-98 Python with COM ~ Christian Tismer 31

7. 1 Processing of data 4 reorganizing tables (2 of 3) Here an excerpt 7. 1 Processing of data 4 reorganizing tables (2 of 3) Here an excerpt from the transformation code. def move. ECG(limit=None): SRCT="ECG" # does all the blocks of data, just ECG for now #Var. ID, Subject. ID, Visit. No, Time. Stamp, Val, Ref. Range. ID print 'loading ecg data, wait a minute. . . ' ds 1 = Halle. get. Data. Set('_export. ECG', limit) ds 1. display() ds 2 = ds 1. select. Columns(['Subject', 'Visit', 'PQ', 'QRS', 'QT', 'HR']) ds 3 = ds 2. de. Tabulate(2) ds 4 = ds 3. transform. By. Column('Subject', global. ID) ds 5 = ds 4. rename. Column('Subject', 'Subject. ID') ds 6 = ds 5. rename. Column('Visit', 'Visit. No') ds 6. display() Here we go reshaping our data Summary. add. Measurements(ds 6, SRCT) 10 -Nov-98 Python with COM ~ Christian Tismer 32

7. 1 Processing of data 4 reorganizing tables (3 of 3) And a look 7. 1 Processing of data 4 reorganizing tables (3 of 3) And a look at the resulting table. . . 10 -Nov-98 Python with COM ~ Christian Tismer 33

7. 2 Exploring of data 4 What is the contents? >>> ds = db. 7. 2 Exploring of data 4 What is the contents? >>> ds = db. get. Data. Set("ae_tab") opening ae_tab 1 reading records. . . 217 >>> ds. get. Field. Names() ['Subject', 'Page', 'Row', 'OCCNO', 'AE', 'HARTS', 'SEVERITY', 'STRTDT', 'STOPDT', 'PATTERN', 'RELSHIP', 'NOACT', 'SMDINCR', 'SMDRED', 'SMDINTR', 'SMDDISC', 'NDTHER', 'CONALT', 'CONADD', 'HOSPITL', 'OUTC', 'SAE', 'OK'] >>> ds. get. Unique. Column. Values("SEVERITY") [None, 1, 2, 3] >>> 10 -Nov-98 Python with COM ~ Christian Tismer 34

7. 2 Exploring of data 4 What is the best datatype for this column? 7. 2 Exploring of data 4 What is the best datatype for this column? def guess. Column. Type(data) : # should go into Dataset perhaps. # This can be done much more sophisticated. # for now, we do simple heuristics. data = filter(None, data) if len(filter(is. Date. Time, data)) == len(data) : return "Date" data = map(str, data) needed = max(map(len, data)) if not data : return "Var. Char(60)" # try to reduce unnecessary floats data = map(lambda s: s[-2: ]==". 0" and s[: -2] or s, data) try : # now try to convert data = map(string. atoi, data) maxval = max(data) minval = min(data) maxval = max(maxval, abs(minval)) if maxval <= 65535: return “Integer" return “Long" except Value. Error: pass 10 -Nov-98 This function didn’t make it into dataset yet, since it is quite database dependant. try : data = map(string. atof, data) return "Double" except Value. Error: pass if needed > 255: return "Memo" elif needed > 60: return "Var. Char(255)" return "Var. Char(60)" Python with COM ~ Christian Tismer 35

7. 3 Processing of data 4 Grouping operations dataset. reduce(columnlist) squeezes all repeated records 7. 3 Processing of data 4 Grouping operations dataset. reduce(columnlist) squeezes all repeated records into one and turns the values in columnlist into lists. Dataset. expand does the inverse. This provides easy processing of groups of records. >>> ds = db. get. Data. Set("select subject, page, ae, severity from [ae_tab 2]") opening select subject, page, ae, severity from [ae_tab 2] reading records. . . 217 >>> ds Data. Set with 217 rows and 4 columns >>> ds 2=ds. reduce(["page", "ae", "severity"]) >>> ds 2. display(colwidth=20, maxrows=5) 'subject' ['page'] ['ae'] ['severity'] 1 [103, 132, 240, ['Diastolic pressure [1, 1, 7 [174, 191, ['Impatiences', 'Dro [1, 1, 8 [174, 191, 213, 305, ['Palpitations', 'Pa [1, 1, 1, 2, 9 [132, 152, 174, ['Hypersalivation', [2, 2, 10 [132] ['Decrease of diasto [1] >>> ds 2 Data. Set with 45 rows and 4 columns >>> ds 2. expand() Data. Set with 217 rows and 4 columns >>> 10 -Nov-98 Python with COM ~ Christian Tismer 1, 1, 1, 2, None 1, 1 2, 2 2, 1 36

7. 4 Processing of data 4 data normalization (1 of 2) with a few 7. 4 Processing of data 4 data normalization (1 of 2) with a few of the grouping operations, redundancy in tables can be analyzed. • • 10 -Nov-98 Group the columns with the contents by reduce insert a unique key column select master and detail datasets expand the detail dataset Python with COM ~ Christian Tismer 37

7. 4 Processing of data 4 data normalization (2 of 2) >>> >>>. . 7. 4 Processing of data 4 data normalization (2 of 2) >>> >>>. . . . >>> >>> ds=dataset. Data. Set(["nr", "nr 2", "data 1", "data 2"], []) import whrandom for nr in range(1, 5): nr 2 = nr*2 for k in range(whrandom. randint(1, 8)): data 1 = whrandom. randint(1, 1000) data 2 = whrandom. randint(1, 2000) ds. append((nr, nr 2, data 1, data 2)) ds 2 = ds. reduce(["data 1", "data 2"]) ds 3 = ds 2. append. Columns(ds 2. record. Range("key")) dsmaster = ds 3. select. Columns(ds 3. notinlist(ds 3. reduced())) dsdetail = ds 3. select. Columns(["key"]+ds 3. reduced()). expand() >>> dsdetail. display() 'key' 'data 1' 'data 2' 0 883 732 0 224 1853 0 889 1170 0 871 1581 1 763 453 1 867 1881 1 870 566 1 646 1509 1 645 612 2 150 1042 10 -Nov-98 >>> dsmaster. display() 'nr' 'nr 2' 'key' 1 2 0 2 4 1 3 6 2 4 8 3 >>> Python with COM ~ Christian Tismer 38

8 Creating results 4 creating result tables in Access 4 creating result tables in 8 Creating results 4 creating result tables in Access 4 creating result tables in Word 4 logging events in an Access table 4 fast writing mode for Access 4 producing cross tables beyond Access’ capabilities 4 formatting output in Word 10 -Nov-98 Python with COM ~ Christian Tismer 39

8. 1 Creating results 4 creating result tables in Access (1 of 2) def 8. 1 Creating results 4 creating result tables in Access (1 of 2) def make. Table. SQL(name, ds): fields = [] for cn in ds. get. Field. Names(): fields. append( "[%s] %s" % ( cn, guess. Column. Type(ds. get. Column(cn))) ) return "create table [%s] (%s)" % (name, string. join(fields, ", ")) >>> make. Table. SQL("master", dsmaster) 'create table [master] ([nr] Integer, [nr 2] Integer, [key] Integer)' >>> from brivtools import make. Table. SQL >>> make. Table. SQL("master", dsmaster) 'create table [master] ([nr] Integer, [nr 2] Integer, [key] Integer)' >>> db. exec. SQL(make. Table. SQL("master", dsmaster)) >>> db. exec. SQL(make. Table. SQL("detail", dsdetail)) >>> db. insert. Data. Set("master", dsmaster) inserting data 4 >>> db. insert. Data. Set("detail", dsdetail) inserting data 21 >>> 10 -Nov-98 Python with COM ~ Christian Tismer 40

8. 1 Creating results 4 creating result tables in Access (2 of 2) 10 8. 1 Creating results 4 creating result tables in Access (2 of 2) 10 -Nov-98 Python with COM ~ Christian Tismer 41

8. 2 Creating results 4 creating result tables in Word (1 of 2) c 8. 2 Creating results 4 creating result tables in Word (1 of 2) c = win 32 com. client. constants def appendtable(rows, columns) : myrange = doc. Range() myrange. Collapse(c. wd. Collapse. End) # sieh nach ob wir in einer Tabelle sind. # wenn ja, hänge einen Absatz an if myrange. Tables. Count: myrange. Paragraphs. Add() myrange. Collapse(c. wd. Collapse. End) tbl = myrange. Tables. Add(myrange, rows, columns) return tbl This is the straight-forward way to create a Word table: Add a table with rows and columns, and fill them cell by cell. Meanwhile you can brew coffee, or have a meal. . . def dstotableslow(ds) : nrows = len(ds)+1 ncols = len(ds. get. Column. Names()) tbl = appendtable(nrows, ncols) header = ds. get. Field. Names() cell = tbl. Cell for col in range(ncols): cell(1, col+1). Range. Text = str(header[col]) content = ds. get. Tuples() for row in range(len(content)): for col in range(ncols): cell(row+2, col+1). Range. Text = str(content[row][col]) return tbl 10 -Nov-98 Python with COM ~ Christian Tismer 42

8. 2 Creating results 4 creating result tables in Word (2 of 2) def 8. 2 Creating results 4 creating result tables in Word (2 of 2) def dstostring(ds): nrows = len(ds)+1 ncols = len(ds. get. Column. Names()) header = ds. get. Field. Names() content = ds. get. Tuples() lis = [string. join(map(str, header), "t")] for line in content: lis. append(string. join(map(str, line), "t")) lis. append("") return string. join(lis, "n") But in most cases, your data will most probably not contain TAB characters. This leads to a very fast solution which converts megabytes of table data into Word in a few seconds. def dstotable(ds): nrows = len(ds)+1 ncols = len(ds. get. Column. Names()) blob = dstostring(ds) if string. count(blob, "n") != nrows or string. count(blob, "t") != (ncols-1) * nrows: return dstotableslow(ds) # no specials, we can use the fast one. doc. Range(). Paragraphs. Add() myrange = doc. Range(). Paragraphs. Add(). Range myrange. Text = blob c = win 32 com. client. constants myrange. Convert. To. Table(Separator=c. wd. Separate. By. Tabs, Num. Columns=ncols, Num. Rows=nrows, Format=c. wd. Table. Format. None) return myrange. Tables(1) For the unlikely cases, we fall back to the slower method. 10 -Nov-98 Python with COM ~ Christian Tismer 43

8. 3 Creating results 4 logging events in an Access table a simple example 8. 3 Creating results 4 logging events in an Access table a simple example where status records are written into existing Access records def msg_to_subjectvisits(subject, visit, msg) : ddb = db. dao. DB rs = ddb. Open. Recordset('''select * from Subject. Visits where Pat_No = %d and Visit = %d''' %(subject, visit)) rs. Edit() rs. Fields("Python. Result"). Value = msg rs. Update() rs. Close() 10 -Nov-98 Python with COM ~ Christian Tismer 44

8. 4 Creating results 4 fast writing mode for Access reading an Access table 8. 4 Creating results 4 fast writing mode for Access reading an Access table with axsaxs is very fast since it is done in larger blocks. Writing is much more expensive since it always must happen recordwise. Early versions of Python’s COM interface were slow at attribute access, and an accelerator module gave speed gains of about 3. 8. The idea is to pick pre-bound functions which are applied later. Speed gain is meanwhile down to 1. 17, but still considerable. >>> import COMutil, speed. COM >>> speed. COM. Install(COMutil. find. Module("DAO. DBEngine. 35")) 10 -Nov-98 Python with COM ~ Christian Tismer 45

8. 5 Creating results 4 producing cross tables beyond Access’ capabilities Access can do 8. 5 Creating results 4 producing cross tables beyond Access’ capabilities Access can do crosstabs only on single fields. The dataset module can collapse multiple fields into a tuple, rotate that and unpack it again - giving multiple crosstabs. • Group name and value fields together with select. Columns • Fold multiple name fields into one • >>> ds = ds. fold(firstnamecol, num_of_cols) • Fold the same number of value fields into one • >>> ds = ds. fold(firstvaluecol, num_of_cols) • Do the crosstabulation • >>> dsx = ds. cross. Tabulate(namefield, valuefield) • Flatten the dataset, that’s it. • >>> dsx = dsx. flatten() • You still have to work on the column names a little 10 -Nov-98 Python with COM ~ Christian Tismer 46

8. 6 Creating results 4 formatting output in Word (1 of 3) class converter: 8. 6 Creating results 4 formatting output in Word (1 of 3) class converter: def __init__(self, factor, divisor=1): if factor != 1 and divisor != 1: self. factor = float(factor) self. divisor = float(divisor) self. operation = self. scale elif divisor != 1 : self. divisor = float(divisor) self. operation = self. divide elif factor != 1 : self. factor = float(factor) self. operation = self. multiply else: self. operation = self. noop def def def multiply(self, arg): divide(self, arg): scale(self, arg): noop(self, arg): __call__(self, arg): return return arg * self. factor arg / self. divisor arg * self. factor / self. divisor arg+0. 0 self. operation(arg) Picas. To. Points = converter(12) Points. To. Picas = converter(1, 12) Inches. To. Points = converter(72) Points. To. Inches = converter(1, 72) Lines. To. Points = converter(12) Points. To. Lines = converter(1, 12) Inches. To. Centimeters = converter(254, 100) Centimeters. To. Inches = converter(100, 254) Centimeters. To. Points = converter(Inches. To. Points(100), 254) Points. To. Centimeters = converter(254, Inches. To. Points(100)) 10 -Nov-98 Python with COM ~ Christian Tismer Python. COM still has some probs with some global functions of Word. Here a little surrogate class which is not only useful for Word. 47

8. 6 Creating results 4 formatting output in Word (2 of 3) True = 8. 6 Creating results 4 formatting output in Word (2 of 3) True = -1 False = 0 def simpleformat(tbl, w 1=3, w 2 = 3. 5): """formats a 3 -columned table arbitrary""" c = win 32 com. client. constants tbl. Columns(1). Set. Width(Centimeters. To. Points(w 1), c. wd. Adjust. Proportional) if tbl. Columns. Count >= 2 : tbl. Columns(2). Set. Width(Centimeters. To. Points(w 2), c. wd. Adjust. Proportional) tbl. Rows(1). Heading. Format = True myrange = tbl. Rows(1). Range myrange. Font. Bold = True fmt = myrange. Paragraph. Format fmt. Space. Before = 3 fmt. Space. After = 3 return tbl Now for a simple formatter which does some very few changes. Next page we look at the result. . . 10 -Nov-98 Python with COM ~ Christian Tismer 48

8. 6 Creating results 4 formatting output in Word (3 of 3) 10 -Nov-98 8. 6 Creating results 4 formatting output in Word (3 of 3) 10 -Nov-98 Python with COM ~ Christian Tismer 49

A Supplemental 4 how to find the right Access version of a . mdb A Supplemental 4 how to find the right Access version of a . mdb file 4 ODBC data sources and ADODB 4 Running your data through SAS 4 how to compress all your Access databases overnight 10 -Nov-98 Python with COM ~ Christian Tismer 50

A. 1 Supplemental how to find the right Access version of a. mdb file A. 1 Supplemental how to find the right Access version of a. mdb file • this is a common problem: Access shows up and offers to convert your database. Buf you want to open it with the right Access version instead. Solution: acc 278. py does what it’s name says. Usage: After installation, you can double-click any. MDB file from the Explorer, and the according Access version will be loaded. Only this action is intercepted. Opening from the file menu works as usual. Installation: - Copy this file into your python directory. - Edit the path settings for your Access executables. - From Explorer, open the File Type settings for. MDB and change the "open" method as follows: d: pythonw. exe d: pythonacc 278. py "%1" (adjust paths accordingly) DAO's "Version" method for Access 7/8 won't help, since the databases are identical in structure. Only the objects which Access stores in the database have changed. An undocumented feature which works for all my databases is a Property "Access. Version" which is stored in a database property collection. We use DAO to read the version number and fire up the right MSACCESS. EXE version. 10 -Nov-98 Python with COM ~ Christian Tismer 51

A. 2 Supplemental 4 ODBC data sources and ADODB this chapter is in preparation. A. 2 Supplemental 4 ODBC data sources and ADODB this chapter is in preparation. The “axsaxs” module meanwhile has an “axsado” companion which works with several databases, like MS SQL server 6. 5, and it will support ODBC sources as well. The module will be ready in a few weeks. 10 -Nov-98 Python with COM ~ Christian Tismer 52

A. 3 Supplemental 4 Running your data through SAS this chapter is in preparation. A. 3 Supplemental 4 Running your data through SAS this chapter is in preparation. The SAS module is under development and will be ready in a few weeks. 10 -Nov-98 Python with COM ~ Christian Tismer 53

A. 4 Supplemental 4 how to compress all your databases overnight (1 of 3) A. 4 Supplemental 4 how to compress all your databases overnight (1 of 3) # # # 19980113 recursion compact CT 971215 compresses MS Access databases and keeps the date info. import dao 3032, os, stat, string, sys infofile = ". compactinfo" def compact(fname) : e=dao 3032. DBEngine() x=os. stat(fname) tim=(x[7], x[8]) # == stat. ST_ATIME, stat. ST_MTIME fneu = fname+". neu" e. Compact. Database(fname, fneu) os. utime(fneu, tim) os. unlink(fname) os. rename(fneu, fname) x=os. stat(fname) return x[stat. ST_SIZE] 10 -Nov-98 Python with COM ~ Christian Tismer 54

A. 4 Supplemental 4 how to compress all your databases overnight (2 of 3) A. 4 Supplemental 4 how to compress all your databases overnight (2 of 3) def get_tree(path, retlist=None) : if retlist is None: retlist=[] dirs = [] ; known = {} try : for row in open(os. path. join(path, infofile)). readlines() : fields = eval(row) if len(fields) == 2 : known[fields[0]] = fields[1] except : pass for entry in os. listdir(path) : fullname = os. path. join(path, entry) x=os. stat(fullname) mode = x[0] # == stat. ST_MODE if stat. S_ISDIR(mode) : print "Directory: %s" %entry dirs. append(entry) elif stat. S_ISREG(mode) and string. lower(entry)[-4: ]==". mdb": print "Database %s" % entry if not (mode & stat. S_IWRITE) : print 10*'*', 'SCHREIBGESCHÜTZT', 10*'*' continue if known. has_key(entry) and known[entry]==x[stat. ST_SIZE] : print 10*'+', 'Größe ist unverändert', 10*'+' continue 10 -Nov-98 Python with COM ~ Christian Tismer 55

A. 4 Supplemental 4 how to compress all your databases overnight (3 of 3) A. 4 Supplemental 4 how to compress all your databases overnight (3 of 3) retlist. append(x[stat. ST_SIZE], fullname) for subdir in dirs: newpath = os. path. join(path, subdir) print "recursing into %s" % newpath get_tree(newpath, retlist) print "back from %s" % newpath return retlist def compactall(path) : print "searching tree %s" % path worklist = get_tree(path) print "sorting by size, smallest first" worklist. sort() for (size, database) in worklist: print "compacting %s (%d)" % (database, size), sys. stdout. flush() newsize = compact(database) logpath, entry = os. path. split(database) logpath = os. path. join(logpath, infofile) open(logpath, "a"). write(repr((entry, newsize))+"n") print "- %d (ratio=%0. 2 f)" % (newsize, (0. 0+newsize)/size) print "done compacting all of %s" % path 10 -Nov-98 Python with COM ~ Christian Tismer 56