36af2bccd3b63fff0f0f783f47d6ec56.ppt
- Number of slides: 15
MTGO Economy Analyzer By Jon Jelinek
Overall goals for this project
• Pull the buying and selling data from the Magic Online game.
• Parse out whether the card was being bought or sold, the card name, and the price.
• Have this service run continuously, and provide a web page where users can search for a given card name and have the buy/sell values returned in a useful form (charts and a listing of data points).
Step 1: Pulling the buying and selling data from the Magic Online game.
The main source of relevant price information is the Classifieds section of the game. Here any player can post a classified ad to try to buy or sell MTGO products ("cards"), which ends up looking something like this to everyone else in the world.
Automated software for ripping some screens
I used AutoIt, a feature-rich scripting language that is especially good at automating tasks in the Microsoft Windows environment. Some features that swayed me to use AutoIt include: PixelChecksum() - takes an area on the screen and returns the checksum value of the image in that region. My thought was to remember the checksum values for the letters on the screen and build my text output from that library of checksums. I was also interested in trying out the Tesseract OCR wrapper for AutoIt.
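The checksum-library idea can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the AutoIt scripts from the project: the class name `GlyphChecksumLookup`, its methods, and the use of CRC32 are all hypothetical choices, and AutoIt's own checksum algorithm may differ. It also shows how fragile the approach is: the same glyph read from a region shifted by one pixel no longer matches.

```java
import java.awt.image.BufferedImage;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

// Hypothetical sketch: the names and the CRC32 choice are assumptions.
class GlyphChecksumLookup {
    // Library of known checksums and the letter each one stands for.
    private final Map<Long, Character> known = new HashMap<>();

    /** CRC32 over the red, green and blue bytes of a rectangular region. */
    static long checksum(BufferedImage img, int x, int y, int w, int h) {
        CRC32 crc = new CRC32();
        for (int row = y; row < y + h; row++) {
            for (int col = x; col < x + w; col++) {
                int rgb = img.getRGB(col, row);
                crc.update((rgb >> 16) & 0xFF);
                crc.update((rgb >> 8) & 0xFF);
                crc.update(rgb & 0xFF);
            }
        }
        return crc.getValue();
    }

    /** Remember which letter a region's checksum stands for. */
    void learn(BufferedImage img, int x, int y, int w, int h, char letter) {
        known.put(checksum(img, x, y, w, h), letter);
    }

    /** Look the region up in the library; '?' means the checksum is unknown. */
    char recognize(BufferedImage img, int x, int y, int w, int h) {
        return known.getOrDefault(checksum(img, x, y, w, h), '?');
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(8, 4, BufferedImage.TYPE_INT_RGB);
        img.setRGB(1, 1, 0xFFFFFF); // a one-pixel "glyph" in the left 4x4 cell
        img.setRGB(5, 1, 0xFFFFFF); // the same glyph in the right 4x4 cell

        GlyphChecksumLookup lookup = new GlyphChecksumLookup();
        lookup.learn(img, 0, 0, 4, 4, 'i');

        // Identical pixels give an identical checksum, so the letter is found.
        System.out.println(lookup.recognize(img, 4, 0, 4, 4));
        // The same glyph read one pixel off gives a different checksum.
        System.out.println(lookup.recognize(img, 3, 0, 4, 4));
    }
}
```

Any change to the pixels in the region, whether from a shifted line of text or a textured background, produces a different checksum, so an exact-match library only works on perfectly aligned, background-free input.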
AutoIt results: PixelChecksum() was not my friend.
While blatantly obvious now, the background colors and images used by Magic Online make a PixelChecksum value for a given region of the screen (a given letter's region) worthless. Example:
Another issue with my AutoIt scripts was that I had set up my first set of scripts to read two lines of text from a saved screenshot. The real-time Classifieds section does not guarantee that my text will be on the same line; it could be off by a few pixels, which would throw off all of my checksum values.
In the end I wrote 15 AutoIt scripts for various tasks. The scripts ranged in size from 1 KB to 17 KB; the 17 KB script was 550 lines of code. I only emphasize this point because I spent a majority of my time fighting with the AutoIt scripting language. In the end, the AutoIt application could not handle doing 1.2 million checksums per screen. I wrote a smaller script to automatically click down the scrollbar and save screenshots to a directory that Java would read. On to Java…
Java tries its luck on the screenshots
There were a few key tasks I wanted Java to perform for this project: clean up the screenshots by eliminating any backgrounds, then run the image through an OCR engine and see if the output is better. Using the OCR engine with AutoIt resulted in unreadable text. I also wanted the Java end of the project to run continuously: feeding on new screenshots as they hit a given directory, processing each image into useful text, dropping it into a database, and finally parsing that bulk text into intelligent data. …At least that's what I hoped to accomplish.
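One simple way to eliminate a background is to threshold each pixel by brightness. The sketch below is an assumption-laden illustration, not the project's actual cleanup code: it assumes the ad text is brighter than the background behind it, and the class name `BackgroundCleaner`, the method `binarize`, and the threshold value are all hypothetical. It writes text pixels as black on a white page.

```java
import java.awt.image.BufferedImage;

// Hypothetical sketch: assumes bright text on a darker, textured background.
class BackgroundCleaner {
    /**
     * Returns a copy of src in which every pixel at least as bright as
     * threshold (0-255) becomes black "ink" and everything else becomes white.
     */
    static BufferedImage binarize(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of perceived brightness (luma).
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                out.setRGB(x, y, luma >= threshold ? 0x000000 : 0xFFFFFF);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        BufferedImage shot = new BufferedImage(3, 1, BufferedImage.TYPE_INT_RGB);
        shot.setRGB(0, 0, 0x202830); // dark background
        shot.setRGB(1, 0, 0x806040); // mid-tone background texture
        shot.setRGB(2, 0, 0xF0F0F0); // bright text pixel

        BufferedImage clean = binarize(shot, 200);
        for (int x = 0; x < 3; x++) {
            boolean ink = (clean.getRGB(x, 0) & 0xFFFFFF) == 0;
            System.out.println("pixel " + x + (ink ? " -> text" : " -> background"));
        }
    }
}
```

A fixed threshold is the crudest option; it fails wherever the background is as bright as the text, which is one reason cleaned images can still confuse an OCR engine.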
Java tries its luck on the screenshots
Over 1,000 lines of code later, the project is not done because of character recognition problems. I've learned that making your own character recognition software is REALLY HARD, and the most popular open source solutions are not that great. Estimated project status for each component:
• 100% AutoIt scripts.
• 100% DB model in place, waiting on quality data.
• 70% Character recognition is holding me back. Cleaned images 100% (Java).
• 50% Web front end. AJAX query and DB link working. Chart code needs work. I didn't want to work on this until I actually had good data.
Output from Tesseract OCR:
OPENTrad¤ng, Buyung and Sawng SOM, M 11, ROE, ZEN, ININK, ALA, GON, ARB cards Many MITHIG and EOIL cards, quad pr¤n¤s* Na. L¤m¤* Human Sawng Nwssa Ravana TBI Name Huararnn W Knwgnt ufma Ramuary ISI SMARTBOT SELLING Warm >·<arv 2 st¤ 5 Heartmendzrm BI Rustrazur Bmnnzrm ¤ 2 Turpur Dustm ¤ 2 Umbra Mysunm UBI /O¤na‘s Gatewardanm BSI /Ba. Hyn¤¤k Gunurtm ¤ 2 /Gnang¤¤ng Ha H¤rn¤ BM Tnur ¤. >. > BUSV DECKTEGHBOT ·> H ·> NEW AUTOMATED SMARTBOT STORE SELLING / BUVING ALL CARDS 7 Mytnuns /Raras /Unn¤mm¤ns /G¤mm¤ns /B¤¤s¢ers 7 SAVE Up Tu 1 B% OEE 7 G SOM/ROE/ZZIN/M 11 Mwnutaurm ¤ 2[
Output from my homemade character recognition program after training on the alphabet:
N. Tra. . dng, ying. an. d. S. eling. SOM, 1 ROE, E. NW, CON. . , RBd. Ma. . ny. MTHO. ILcadsg. . oodprces. . We. co Ho. 'rnl 0. h, . . 'o'r BUS. DECK. TC. H. B. O. TN. E. . . WA. UTOM. . A. TE. D. S. MA. BO. TSTO. . RE. S. E. . LL. N. GBUYI. . . NG. . A. L. L. C. R. DSM. ythcs. /Ra. es/Uc. om. . m. on. /C. o. mm. o. . n. s/Bosters. S. AVEUp. To. 10 OFC. a. . rd. Bot, S. M. A. R. Gla. . . ts. . . 5. . ecst. Bror. Ra 0 Vitw. M. no. ta. u. 0. 02. B. . U. . . YI. . N. . G. . . Prime. . va. . . lita. . . n 35 Ven. g. . . e. . . v
Worth mentioning:
The Classifieds section can produce 50+ screenshots' worth of ads on one run down the scrollbar. It takes my Java program 5 to 10 minutes to process one screenshot.
Designing a way to discover letters was my greatest programming feat on this project: a recursive growing function. It stops when no more black pixels are reachable and returns the upper-left X, Y coordinate plus the width and height of the region. This allowed me to save sub-images of the overall screenshot as little letters. I created a hash of the checksum values for the little images, and wrote a config file with all known checksum values and their corresponding letters.
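The letter-discovery step can be sketched roughly as below. This is a reconstruction under assumptions, not the original code: the class and method names are hypothetical, it assumes the screenshot has already been cleaned to black text on white, it treats diagonally touching pixels as connected, and it swaps the recursion for an explicit stack so that large regions cannot overflow the call stack.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of region growing over black pixels.
class LetterFinder {
    static boolean isBlack(BufferedImage img, int x, int y) {
        return (img.getRGB(x, y) & 0xFFFFFF) == 0;
    }

    /** Grows from one black pixel until no more black pixels are reachable. */
    static Rectangle grow(BufferedImage img, int startX, int startY, boolean[][] visited) {
        int minX = startX, maxX = startX, minY = startY, maxY = startY;
        Deque<int[]> stack = new ArrayDeque<>();
        stack.push(new int[] {startX, startY});
        visited[startY][startX] = true;
        while (!stack.isEmpty()) {
            int[] p = stack.pop();
            minX = Math.min(minX, p[0]);
            maxX = Math.max(maxX, p[0]);
            minY = Math.min(minY, p[1]);
            maxY = Math.max(maxY, p[1]);
            // Visit the eight neighbours of the current pixel.
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int nx = p[0] + dx, ny = p[1] + dy;
                    if (nx < 0 || ny < 0 || nx >= img.getWidth() || ny >= img.getHeight()) continue;
                    if (visited[ny][nx] || !isBlack(img, nx, ny)) continue;
                    visited[ny][nx] = true;
                    stack.push(new int[] {nx, ny});
                }
            }
        }
        // Upper-left corner plus width and height of the region.
        return new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1);
    }

    /** Scans row by row and returns one bounding box per connected black region. */
    static List<Rectangle> findLetters(BufferedImage img) {
        boolean[][] visited = new boolean[img.getHeight()][img.getWidth()];
        List<Rectangle> letters = new ArrayList<>();
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                if (!visited[y][x] && isBlack(img, x, y)) {
                    letters.add(grow(img, x, y, visited));
                }
            }
        }
        return letters;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(7, 5, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 5; y++)
            for (int x = 0; x < 7; x++) img.setRGB(x, y, 0xFFFFFF); // white page
        for (int y = 1; y <= 3; y++) img.setRGB(1, y, 0x000000);    // a vertical stroke
        for (int y = 2; y <= 3; y++)
            for (int x = 4; x <= 5; x++) img.setRGB(x, y, 0x000000); // a 2x2 blob
        for (Rectangle r : findLetters(img)) System.out.println(r);
    }
}
```

Each returned rectangle can be cut out with `img.getSubimage(r.x, r.y, r.width, r.height)` and checksummed for lookup. Letters made of disconnected parts (the dot of an "i", for example) come back as separate regions and would need to be merged in a later step.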
End