An Introduction to Parrot
Dan Sugalski
dan@sidhe.org
January 28, 2004

Overview
What’s it all about

Purpose
• Optimized for Dynamic Languages
  • Perl 5, Python, Ruby specifically
• Run really, really fast
  • Or at least as fast as reasonable under the circumstances
• Easily extendable
• Easily embeddable
• Play Zork

History
How we got where we are

OSCON 2000
• Infamous mug pitching incident
• Perl 6 started
• Language and software developed separately

Perl 6 -- not too much bigger
• That hasn’t lasted
• Allison’s talking about that one
• The start was smallish, though
• Fix the annoyances
• Amazing how many things turned out to be annoying

Big language umbrella
• Not much semantic difference between Perl 5, Python, and Ruby
• Perl 6 was obviously going to borg them and a bit more
• Even ML and Haskell haven’t been safe
• More concepts have gone in as time has progressed

Parrot went for them all
• Yeah, we were getting bored
• Had to do something
• We liked Ruby and even Python
• We hated having multiple interpreters around

Parrot and the Parrot Prank
• 2001 April Fools Joke
• Perpetrated by Simon Cozens
• Parrot -- New language
• Perl & Python Amalgam
• Pretty funny as these things go

Timeline
• The project came first
• Then, the Parrot Joke
• We grabbed the name

Non-Purpose
• Don’t care about non-dynamic languages
• Not much, at least
• Other people can worry
• Engineering tradeoffs favor dynamic languages

True language neutrality is impossible
• Vicious sham
• All engines have a bias
• Even the hardware ones
• Processors these days really like C

Architecture
How it’s supposed to look

Buzzwords
• Register based, object-oriented, language agnostic, threaded, event-driven, async I/O capable virtual machine
• No, really

Software goals
• Fast
• Safe
• Extendable
• Embeddable
• Maintainable

Administrative goals
• Resource Efficient
• Controllable
• Not suck when used as an Apache module
• Cautious about whole-system impact

Driving assumptions
• C function calls are inexpensive
• L1 & L2 caches are large
• Memory bandwidth is limited
• CPU pipeline flushes are expensive
• Interpreter must be fast
• JIT a bonus, not a given

Interpreter Core in Pictures
[Diagram: the interpreter core surrounded by its integer, float, string, and PMC registers, the user stack, the control stack, the frame stacks, lexicals, and globals]

Parser
• Source goes in, AST comes out
• Built in part on Perl 6 rules engine
• Pluggable parser architecture

Compile and optimize (IMCC)
• Turns the output of the parser into executable code
• Optional optimizing step
• Register coloring algorithms provided here

Execution
• Interpreter
• JIT
• C code
• Native executables

Base Engine
• Bytecode driven
• Platform-neutral bytecode
• Register-based system
• Stacks
• Continuation-passing style

Bytecode
• Directly executable
• Resembles native executable format
  • Code
  • Constants
  • Metadata
  • No BSS, though

Designed for efficiency
• Directly executable
• mmap()ped in
• Only complex constants (strings, PMCs) need fixup
• Converts on size and/or endian mismatch

Platform Neutrality
• If native format, used directly
• Otherwise endian-swapped
• Off-line utility to convert
• Only difference is speed hit on startup
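
The "used directly, otherwise endian-swapped" rule can be sketched in a few lines of C. This is only an illustration under stated assumptions: `swap_word` and `fixup_segment` are hypothetical names, bytecode words are assumed to be 32 bits wide, and the word-size conversion the slides also mention is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of one 32-bit bytecode word. */
uint32_t swap_word(uint32_t w) {
    return ((w & 0x000000FFu) << 24) |
           ((w & 0x0000FF00u) << 8)  |
           ((w & 0x00FF0000u) >> 8)  |
           ((w & 0xFF000000u) >> 24);
}

/* Hypothetical loader step: swap a segment in place only when the file's
 * byte order differs from the host's. Returns the number of words swapped,
 * which is the whole of the startup cost. */
size_t fixup_segment(uint32_t *words, size_t count, int file_is_native) {
    assert(words != NULL || count == 0);
    if (file_is_native)
        return 0;                      /* native format: use directly */
    for (size_t i = 0; i < count; i++)
        words[i] = swap_word(words[i]);
    return count;
}
```

A native-format file costs nothing here, which is the point of the slide: the swap loop runs once at load time and never again.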

Registers
• All operations revolve around VM registers
• Essentially CPU registers
• Four types
  • Integer
  • Float
  • String
  • PMC
• 32 of each

Registers
• Parrot’s one RISC concession
• Non-load/store must operate on registers or constants
• JIT maps VM registers to platform registers if there are some
• Otherwise pure (and absolute) memory addressing to VM registers
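
One way to picture the register file described on the two slides above is as four fixed arrays of 32 slots. The sketch below is an assumption for illustration, not Parrot's actual layout: `sketch_registers`, `reg_set_int`, and `reg_add_int` are made-up names, and the string and PMC types are left opaque.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 32                    /* "32 of each" */

struct sketch_string;                  /* opaque placeholder types */
struct sketch_pmc;

/* Hypothetical register file: one bank per register type. */
typedef struct {
    int32_t               int_reg[NUM_REGS];
    double                num_reg[NUM_REGS];
    struct sketch_string *str_reg[NUM_REGS];
    struct sketch_pmc    *pmc_reg[NUM_REGS];
} sketch_registers;

/* Roughly "set I<dest>, <constant>". */
void reg_set_int(sketch_registers *r, int dest, int32_t value) {
    assert(dest >= 0 && dest < NUM_REGS);
    r->int_reg[dest] = value;
}

/* Roughly "add I<dest>, I<a>, I<b>": operands are register numbers,
 * so the op body is plain array indexing. */
void reg_add_int(sketch_registers *r, int dest, int a, int b) {
    assert(dest >= 0 && dest < NUM_REGS);
    assert(a >= 0 && a < NUM_REGS && b >= 0 && b < NUM_REGS);
    r->int_reg[dest] = r->int_reg[a] + r->int_reg[b];
}
```

Because every operand is a fixed offset into one of these arrays, an interpreter can address registers directly in memory, and a JIT is free to pin some of them to hardware registers instead.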

Stacks
• Six stacks
• One general purpose typed stack
• Four register backing stacks
  • Push/pop half register frames in one go
  • Faster than push/pop of frames to general stack
• One control stack
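
Saving half a register bank "in one go" could look something like the following C sketch. Everything named here (`int_bank`, `push_half`, `pop_half`, the fixed `MAX_DEPTH`) is a hypothetical stand-in; the real backing stacks are not fixed-size arrays.

```c
#include <assert.h>
#include <string.h>

#define REGS      32
#define HALF      (REGS / 2)
#define MAX_DEPTH 64                   /* assumption: fixed depth for the sketch */

/* Hypothetical integer register bank with its own backing stack. */
typedef struct {
    long int_reg[REGS];
    long saved[MAX_DEPTH][HALF];
    int  depth;
} int_bank;

/* Save registers 0..15 (or 16..31 when top_half is nonzero) with one memcpy. */
void push_half(int_bank *b, int top_half) {
    assert(b->depth < MAX_DEPTH);
    memcpy(b->saved[b->depth], b->int_reg + (top_half ? HALF : 0),
           sizeof b->saved[0]);
    b->depth++;
}

/* Restore the most recently saved half frame into the chosen half. */
void pop_half(int_bank *b, int top_half) {
    assert(b->depth > 0);
    b->depth--;
    memcpy(b->int_reg + (top_half ? HALF : 0), b->saved[b->depth],
           sizeof b->saved[0]);
}
```

A single block copy of sixteen registers is what makes this cheaper than pushing each register onto the general-purpose typed stack one at a time.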

Stacks
• Bit of a misnomer
• Really tree of stack frames
• Confusing, though

Continuation Passing Style
• Used for calling conventions
• Parrot makes heavy use of continuations
• If you don’t know they’re there you’ll not care
• All Ruby’s fault, really
• Hidden from HLL code

Parrot’s data
Where the magic lives

Data isn’t passive
• Lots of functionality hidden in data
• Partly OO
• Or as OO as you get in C

Strings
• Language neutral
• Encapsulate language behavior, encoding, and character set
• Annoyingly complex

Basic String Diagram
[Diagram: string header fields: Buffer Info, Encoding, Charset, Language, Flags]

Encoding
• Represents how the bits are turned into ‘characters’
• Code points, really
• Even for non-Unicode encodings
• Handles transformations from/to storage

Character Set
• Which characters the code points represent
• Basic character manipulation happens here
• Case mangling, substrings
• Transformations to other character sets

Language
• Nuances of sorting and case mangling
• Interpretation of most Asian text when using Unicode
• Ignorable if you don’t care

Unicode
• Parrot does Unicode
• Used as pivot encoding/charset
• IBM’s ICU library
• Didn’t want to write another badly done Unicode library

Efficiency concerns
• Multiple encodings/charsets means less conversion
• Transform data only when needed
• Strings are mutable
• COW system for space/speed efficiency
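
The copy-on-write idea for mutable strings can be illustrated with a small C sketch. It assumes a share count on the buffer, which is a simplification chosen for the example; `cow_buffer`, `cow_string`, and the `cow_*` functions are hypothetical names and say nothing about how Parrot tracks shared buffers internally.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared buffer: several string headers may point at it. */
typedef struct {
    char  *bytes;
    size_t len;
    int    refs;                       /* number of headers sharing the buffer */
} cow_buffer;

typedef struct {
    cow_buffer *buf;
} cow_string;

cow_string cow_new(const char *text) {
    cow_buffer *b = malloc(sizeof *b);
    if (b == NULL)
        abort();
    b->len = strlen(text);
    b->bytes = malloc(b->len + 1);
    if (b->bytes == NULL)
        abort();
    memcpy(b->bytes, text, b->len + 1);
    b->refs = 1;
    return (cow_string){ b };
}

/* Copying only bumps the share count; no bytes are moved. */
cow_string cow_copy(cow_string s) {
    s.buf->refs++;
    return s;
}

/* Writing first detaches from a shared buffer, then mutates the private copy. */
void cow_set_char(cow_string *s, size_t index, char c) {
    assert(index < s->buf->len);
    if (s->buf->refs > 1) {
        cow_buffer *old = s->buf;
        s->buf = cow_new(old->bytes).buf;
        old->refs--;
    }
    s->buf->bytes[index] = c;
}

void cow_free(cow_string s) {
    if (--s.buf->refs == 0) {
        free(s.buf->bytes);
        free(s.buf);
    }
}
```

Copies stay cheap until someone writes, which is how mutable strings and space efficiency can coexist.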

The PMC
• Represents a HLL variable
• Language agnostic
• Everything pivots off PMCs

PMC diagram
[Diagram: PMC layout: Vtable, Flags, Cache, Data Pointer, Metadata, GC handle, Synchronization]

The Vtable
• How all the functionality is implemented
• Almost everything defers to PMCs
• Large part of interpreter logic in PMCs
• Allows fast operator overloading and tying

Some vtable operations
• Addition
• Subtraction
• Multiplication
• Division
• Bitwise operations
• Loading
• Storing
• Comparison
• Truth
• Type conversion
• Logical operations
• Finalization

Vtable functions may be Parrot
• How languages implement user operator overloading
• Used for Perl-style tying
• Usable for operator wrapping

PMCs are typed
• Types can change
• Allows customized behavior
• Cuts out some overhead

All PMCs indexable
• As array or hash
• Operations may be delegated
• PMC may be both hash and array
• Scalar as well

Multimethod dispatch
• Core interpreter functionality
• Used for many PMC operations
• Beats hand-rolling it
• Dispatch surprisingly fast
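
One common way to make two-argument dispatch fast is a table indexed by the types of both operands. The C sketch below shows that technique under stated assumptions: the type ids, `mmd_value`, and `mmd_add` are invented for the example, and the slides do not say which lookup structure Parrot itself uses.

```c
#include <assert.h>

enum { TYPE_INT, TYPE_FLOAT, NUM_TYPES };

/* Hypothetical tagged value; a double payload keeps the sketch short. */
typedef struct {
    int    type;
    double value;
} mmd_value;

typedef mmd_value (*mmd_func)(mmd_value left, mmd_value right);

static mmd_value add_int_int(mmd_value l, mmd_value r) {
    return (mmd_value){ TYPE_INT, (double)((long)l.value + (long)r.value) };
}

static mmd_value add_any_float(mmd_value l, mmd_value r) {
    return (mmd_value){ TYPE_FLOAT, l.value + r.value };
}

/* The function is chosen by BOTH operand types, not just the left one. */
static const mmd_func add_table[NUM_TYPES][NUM_TYPES] = {
    [TYPE_INT]   = { [TYPE_INT] = add_int_int,   [TYPE_FLOAT] = add_any_float },
    [TYPE_FLOAT] = { [TYPE_INT] = add_any_float, [TYPE_FLOAT] = add_any_float },
};

mmd_value mmd_add(mmd_value l, mmd_value r) {
    assert(l.type >= 0 && l.type < NUM_TYPES);
    assert(r.type >= 0 && r.type < NUM_TYPES);
    return add_table[l.type][r.type](l, r);
}
```

With a table like this, dispatch is two array indexes and an indirect call, which is one plausible reading of "surprisingly fast".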

Magic all hidden
• User code never knows about magic
• Allows transparent behaviour changes
• One big pivot point for dispatch

Objects
• Standard but optional object system
• Standard object protocols
• Standard object opcodes

Everything can be an object
• Objects have attributes
• Objects can have methods called on them
• All PMCs have get/set attribute vtable entries
• All PMCs have a method call entry
• Therefore, all PMCs are objects

Objects are cross-language
• Obey the protocols and use the facilities and you’re fine
• Can even inherit across object systems
• Parrot will enforce some invariance

Object system optional
• Okay to roll your own
• Don’t have to interoperate
• Load up your own ops and go for it

Base support for objects
• Scoped method caches
• Selective cache invalidation
• Signature based dispatch in core
• Op support
  • Property and attribute access
  • Method call
  • Subclassing
  • can, is, and does

Assembly Language
Because hand-generating bytecode is annoying

Sample

    set N0, 10
    set N1, 0
    loop:
    print "Hello, world!\n"
    add N1, 1    # Could be "inc N1"
    ne N0, N1, loop
    end

Straightforward
• Destination, source
    add DEST, SOURCE1, SOURCE2
• VAX is not dead
• Some magic during assembly

Ops pre-exploded
• No actual add op
• add_i_i_ic, add_i_i_i, add_p_i_i
• Etc…
• Assembler chooses right op
• No runtime type checking needed
• No runtime JIT code analysis needed

Ops pre-exploded
• Little extra code needed
• Ops source has custom macro preprocessor
• Reduces maintenance load

Add example

    inline op add(out INT, in INT, in INT) {
        $1 = $2 + $3;
        goto NEXT();
    }

Add example

    opcode_t *
    Parrot_add_i_i_i (opcode_t *cur_opcode, struct Parrot_Interp *interpreter) {
        IREG(1) = IREG(2) + IREG(3);
        return (opcode_t *)cur_opcode + 4;
    }

    opcode_t *
    Parrot_add_i_ic_i (opcode_t *cur_opcode, struct Parrot_Interp *interpreter) {
        IREG(1) = cur_opcode[2] + IREG(3);
        return (opcode_t *)cur_opcode + 4;
    }
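
Functions shaped like the two above, each returning the address of the next op, suggest a very small run loop. The following C sketch shows one such loop under loud assumptions: `opcode_t` is taken to be a `long`, the opcode numbering and `sketch_interp` are invented, and the `end` op signals a stop by returning NULL.

```c
#include <assert.h>
#include <stddef.h>

typedef long opcode_t;                          /* assumption for the sketch */
typedef struct { long int_reg[32]; } sketch_interp;
typedef opcode_t *(*op_func)(opcode_t *cur_opcode, sketch_interp *interp);

/* Hypothetical opcode numbers; each variant is its own op. */
enum { OP_END, OP_SET_I_IC, OP_ADD_I_I_I, OP_ADD_I_IC_I, NUM_OPS };

static opcode_t *op_end(opcode_t *pc, sketch_interp *in) {
    (void)pc; (void)in;
    return NULL;                                /* stop the run loop */
}
static opcode_t *op_set_i_ic(opcode_t *pc, sketch_interp *in) {
    in->int_reg[pc[1]] = pc[2];                 /* register <- constant */
    return pc + 3;
}
static opcode_t *op_add_i_i_i(opcode_t *pc, sketch_interp *in) {
    in->int_reg[pc[1]] = in->int_reg[pc[2]] + in->int_reg[pc[3]];
    return pc + 4;
}
static opcode_t *op_add_i_ic_i(opcode_t *pc, sketch_interp *in) {
    in->int_reg[pc[1]] = pc[2] + in->int_reg[pc[3]];   /* constant + register */
    return pc + 4;
}

static const op_func op_table[NUM_OPS] = {
    [OP_END]        = op_end,
    [OP_SET_I_IC]   = op_set_i_ic,
    [OP_ADD_I_I_I]  = op_add_i_i_i,
    [OP_ADD_I_IC_I] = op_add_i_ic_i,
};

/* The whole interpreter core: look up the op, call it, follow its return. */
void run(opcode_t *pc, sketch_interp *interp) {
    while (pc != NULL) {
        assert(*pc >= 0 && *pc < NUM_OPS);
        pc = op_table[*pc](pc, interp);
    }
}
```

Because the operand kinds are baked into the opcode number, the loop never inspects operand types at run time; the assembler already made that choice.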

Very CISCy
• I like assembly
• Wanted it to be easily targeted
• Wanted to be easy to hand-write
• Good fit to compiler output
• CISC fits interpreters better

Rich instruction set
• Side-effect of interoperability
• Nifty side effects
  • Very fast dispatch
  • Much lower JIT overhead

Extensible instruction set
• Loadable on demand
• Provides fast access to code
• Allows language-specific opcodes
• Even writable in Parrot bytecode
• Blurs opcode/function/method lines

PIR
• Parrot Intermediate Language
• Slightly higher level than assembly
• Runs through the optimizer

PIR
• Assembly without the annoyances
• Infinite number of registers
• Function header and parameter setup

Sample (Same as assembly)

    $N0 = 10
    $N1 = 0
    Loop:
    print "Hello, world!\n"
    $N1 = $N1 + 1
    ne $N0, $N1, Loop
    end

Assembly++
• Locals
• Register allocation and coloring
• Automatic sub creation
• Simple expressions
• Calling-convention aware

PIR Example 2

    .sub _MAIN prototyped
        .param pmc argv
        .local int count = argv[0]
        _printme(count)
        end

    .sub _printme prototyped
        .param int Max
        .local int Current = 0
      Loop:
        print "Hello, world\n"
        inc Current
        ne Current, Max, Loop
        .pcc_begin_return 1
        .pcc_end_return
    .end

Register allocation
• Infinite number of temps
• Lifetimes are traced and managed
• Automatic spilling
• Single nastiest register task

Toys and Tools
It’s Alive!

Demos
• Ncurses demo
• Parrot Basic demo
• Parrot CGI demo
• Real Work demo

Functioning Languages
• The gag languages
  • Befunge
  • BF
  • Ook!

Functioning Languages
• The real languages
  • Forth
  • BASIC
  • Scheme
  • DecisionPlus

Functioning Languages
• The unfinished languages
  • Perl 5
  • Perl 6
  • Python
  • Ruby

Security
Because sometimes people just suck

Security requirements: Handle
• Untrustworthy code
• Malicious code
• Badly written code

Protection Categories
• Resource usage
• Access
• Mistrusted bytecode
• Isolated Interpreters

Resource usage
• Memory, CPU, IO, and time quotas
• Individually settable
• May be enabled and disabled on the fly with sufficient privilege

Access
• Restrictions on what code can do
• Introduces a VMS-style privilege system
• Areas of higher and lower privilege

Mistrusted bytecode
• Assumes malformed bytecode
• Verifies all arguments
• Verifies jump destinations
• Much slower
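
What "verifies all arguments" and "verifies jump destinations" might involve can be sketched as a two-pass check in C. The three-op instruction set, the absolute jump offsets, and every name here (`verify`, `V_JUMP_IC`, `v_op_len`, and so on) are assumptions made for the example, and this check runs ahead of execution rather than during it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical three-op instruction set for the sketch. */
enum { V_END, V_SET_I_IC, V_JUMP_IC, V_NUM_OPS };

/* Length of each op in words, opcode included. */
static const size_t v_op_len[V_NUM_OPS] = { 1, 3, 2 };

#define V_NUM_REGS 32
#define V_MAX_CODE 1024                /* assumption: small fixed limit */

bool verify(const long *code, size_t n) {
    bool op_start[V_MAX_CODE] = { false };
    size_t pc;

    assert(code != NULL || n == 0);
    if (n > V_MAX_CODE)
        return false;

    /* Pass 1: opcode numbers, truncated ops, register operands. */
    for (pc = 0; pc < n; pc += v_op_len[code[pc]]) {
        long op = code[pc];
        if (op < 0 || op >= V_NUM_OPS)
            return false;
        if (v_op_len[op] > n - pc)
            return false;              /* operands run off the end */
        if (op == V_SET_I_IC &&
            (code[pc + 1] < 0 || code[pc + 1] >= V_NUM_REGS))
            return false;              /* no such register */
        op_start[pc] = true;
    }

    /* Pass 2: every jump must land on the first word of some op. */
    for (pc = 0; pc < n; pc += v_op_len[code[pc]]) {
        if (code[pc] == V_JUMP_IC) {
            long target = code[pc + 1];    /* absolute word offset */
            if (target < 0 || (size_t)target >= n || !op_start[target])
                return false;
        }
    }
    return true;
}
```

Checks like these touch every word of the program, which gives a feel for why the slide ends with "much slower".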

Isolated interpreters
• Can run code in a separate interpreter
• Controlled environment

Quickies
Putting a limit on boredom

Events
• Async event system built in
• One shared, integrated event loop
• Everything can use it

IO
• All IO asynchronous
• Synchronous wrappers provided
• Integrated with event system
• Under-the-hood thread games where needed

Threads
• Designed to be threaded from the ground up
• Not the POSIX thread model, alas
• Interpreters too heavy-weight
• No guarantees of user safety, just interpreter safety

Parrot Development
Always ongoing

Getting and installing Parrot
• Point releases
  • Whenever “Big Things” get done
  • Get good workout before release
• Snapshots
  • Three times a day
  • For folks without easy CVS access
  • http://cvs.perl.org/snapshots/parrot/

Getting and installing Parrot
• CVS
  • Full anon access
  • :pserver:anonymous@cvs.perl.org:/cvs/public
• Rsync
  • From latest CVS tree
  • rsync -av --delete cvs.perl.org::parrot-HEAD parrot

Builds on
• Many Unices
  • Linux
  • Mac OS X
  • *BSD
  • Solaris
  • AIX
• WinXP
  • Visual Studio
  • Cygwin

Regular automated testing
• Tinderbox system
• Regular checkout, build, and testing
• http://tinderbox.perl.org/tinderbox/bdshowbuild.cgi?tree=parrot

Parrot Mailing lists
• Parrot-internals@perl.org
  • Was perl6-internals
  • Most of the action
• Parrot-compilers@perl.org
• @parrotcode.org soon, hopefully

Questions?