dd5e424435dfaec516e69347c9eddd1f.ppt
- Number of slides: 35
Sun Solaris OS
Glenn Barney, gb2174@columbia.edu
COMS E6998.002: Advanced Computer Design
Metrics
• Sun focused on 5 major design areas
  – Performance
  – Security: prevent, detect, respond
  – Availability
  – Utilization
  – Platform choice: the hardware compatibility list covers 716 x86/x64 systems and 75 SPARC systems
• The major metric successes are security and availability. Performance and utilization are a bit more questionable, but still very good, as we'll see.
History of Solaris
• It's a Unix OS that is an amalgam of earlier Unix-based OSs, mainly Sun's first OS, SunOS, which was based on BSD, and AT&T's Unix System V.
• The general timeline:
  – 1970 to 1979: Unix is first written in assembly and then in C by Ritchie and Thompson.
  – 1982: Bill Joy leaves Berkeley, co-founds Sun and develops SunOS based on BSD.
  – 1984 to 1987: AT&T develops and releases System V, which competes with BSD until the mid 90s.
  – 1988: AT&T purchases a large stake in Sun.
  – 1993: Sun announces the first version of Solaris, which will no longer be based on BSD but mainly on System V Release 4, a mix of other Unix distributions. The competing Unix standards group, OSF, begins a GUI war with Sun, supporting its own Motif/X against Sun's OPEN LOOK.
  – 1994: Sun creates the Common Desktop Environment to support both Motif and OPEN LOOK; by Solaris 5 it's officially supported.
The Solaris Gestalt
• Pulled in from BSD
  – Virtual memory system
  – Fast file system with symbolic links
  – TCP/IP networking system with Kerberos, Telnet, FTP, sendmail
  – Alternate shells to the Bourne shell (C shell)
  – Vendor products like NFS (from Sun)
  – Symmetric multiprocessing support, thread management and shared libraries
• Pulled in from System V
  – Interprocess Communication
  – Bourne shell enhancements
  – STREAMS and TLI networking libraries
  – Remote File Sharing
  – Improved memory paging
  – Application Binary Interface
• Created by Sun for SunOS
  – SunOS 4.x: NFS, OpenWindows 2.0 GUI, OpenBoot monitor, DeskSet utilities, multiprocessing support
  – SunOS 5.x (i.e. Solaris): SMP for more than 100 processors in a single server, CDE (Motif, PostScript, OPEN LOOK), GNOME 2.0 to support Linux integration, Network Information Service (NIS), clustering, Java, and an ever-growing list of new features
Some General Solaris Tidbits
• Solaris 10 does not support old Sun hardware. The chipsets it does support are UltraSPARC II, IV and newer, 32-bit Intel x86 and 64-bit AMD Opteron.
• Of course, old 32-bit SPARC programs are still supported.
• Sun does support batch jobs like JCL: Sun MBM preserves batch step constructs on Sun systems.
• Load balancing seems to require a third-party application.
• Sun Network Cache and Accelerator (SNCA), available since Solaris 8, helps cache and serve web pages, but doesn't do load balancing per se.
Solaris Overview
• Processor/platform-specific code – less than 5% of the kernel, developed to adapt to different hardware platforms
• Device drivers – dynamically loaded, and use a common published interface
• File system and volume management – treat a large number of disks as a single volume. The Virtual File System supports unlimited file system extensions: UFS, NFS, Sun StorEdge file systems, PC file systems, etc., plus the new Zettabyte File System.
• Unified TCP/IP stack
• Linux system call handler is in-kernel; it catches Linux system calls and dispatches the equivalent Solaris kernel functions
• DTrace debugging system, new for Solaris 10: a clean, modular, pre-deployed global debugging solution at minimum runtime cost.
Solaris Modular Kernel
• Seven types of loadable modules
  – Scheduling classes
  – File systems
  – Loadable system calls
  – Loaders for executable file formats
  – STREAMS modules
  – Bus or device drivers
  – Miscellaneous
• Solaris Kernel Thread – the core unit of execution that is scheduled and executed on a processor.
  – Has an execution state and context that includes a global priority and scheduling class
  – Is the unit that gets scheduled, executed and context switched on and off processors
• User Thread – user-level thread state maintained within a user process
• Process – the executable form of a program
• Lightweight Process (LWP) – kernel-visible execution context for a user thread
• Solaris 2 to 8 had a "two-level threads model" where many threads could be assigned to a smaller group of LWPs. However, the two-level model was replaced with a 1-to-1 model. Why? Basically it was too complicated. The 1-to-1 model brought:
  – Improved performance, scalability, and reliability
  – Reliable signal behavior
  – Improved adaptive mutex lock implementation
  – User-level sleep queues for synchronization objects
Kernel Thread Scheduling
• The dispatcher uses a priority model to select which kernel thread to execute next.
• Supports preemption, and the kernel itself is preemptible.
• 170 global priorities partitioned by scheduling class. The three main classes are TS, SYS, and RT.
  – Timeshare (TS) – default for all processes and the kernel threads in the process.
  – Interactive (IA) – enhanced TS used by the windowing system to boost threads under the window focus.
  – Fair Share Scheduling (FSS) – share based, not priority based.
  – Fixed Priority (FX) – fixed priority.
  – System (SYS) – used for kernel threads; they are bound and run until they block or complete.
  – Real Time (RT) – fixed priority, fixed time quantum scheduling.
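The priority model above can be pictured with a toy dispatcher. This is only a sketch: the per-class global priority ranges in `CLASS_RANGES` and the thread names in the usage note are assumptions chosen for illustration, not values from the slides, and the real dispatcher keeps per-CPU queues rather than one heap.

```python
import heapq

# Assumed global priority range for each scheduling class (illustrative only).
CLASS_RANGES = {
    "TS": (0, 59),
    "SYS": (60, 99),
    "RT": (100, 159),
}


def global_priority(sched_class, class_priority):
    """Map a class-relative priority onto the assumed global range, clamped at the top."""
    low, high = CLASS_RANGES[sched_class]
    return min(low + class_priority, high)


class Dispatcher:
    """Toy dispatcher: always picks the runnable thread with the highest global priority."""

    def __init__(self):
        self._queue = []
        self._seq = 0  # tie-breaker keeps FIFO order among equal priorities

    def make_runnable(self, name, sched_class, class_priority):
        prio = global_priority(sched_class, class_priority)
        # heapq is a min-heap, so store the negated priority.
        heapq.heappush(self._queue, (-prio, self._seq, name))
        self._seq += 1

    def pick_next(self):
        if not self._queue:
            return None
        _, _, name = heapq.heappop(self._queue)
        return name
```

With a hypothetical TS thread `shell`, SYS thread `pageout` and RT thread `audio` all runnable, `pick_next()` returns `audio` first, then `pageout`, then `shell`, because every RT priority sits above every SYS priority, which in turn sits above every TS priority.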
Interprocess Communication and Signals
• Traditional Unix IPC
  – Pipes – directly channel data between related processes through a file-like object
  – Named pipes – FIFO pipes actually implemented as files in the file system namespace
  – Sockets – can be over a network or local (domain)
• System V IPC
  – Shared memory – processes create a segment of memory that is shared among them
  – Message queues – each message contains a 32-bit type value and a data payload
  – Semaphores – processes can sleep on them; used for synchronization, but any process can increment them
• Solaris doors – the door server contains a thread that sleeps waiting for a client. The client calls the server through a door, and scheduling control is passed directly from the requesting thread to the door server thread. Very low latency turnaround.
• Signals – can interrupt a process after an event occurs. Signals can be ignored, caught and handled, or treated with a default action.
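As a small illustration of the traditional pipe between related processes, here is a sketch that forks a child and reads what it writes. It assumes a Unix-like system where `os.fork()` is available; the function name `pipe_roundtrip` is made up for this example.

```python
import os


def pipe_roundtrip(message):
    """Fork a child that writes `message` into a pipe; the parent reads it back."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only writes, so close the read end first.
        try:
            os.close(read_fd)
            os.write(write_fd, message)
            os.close(write_fd)
        finally:
            os._exit(0)  # never fall through into the parent's code path
    # Parent: only reads, so close the write end to be able to see end-of-file.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child
    return b"".join(chunks)
```

The pipe works here only because the child inherits the file descriptors from its parent, which is why plain pipes are limited to related processes while named pipes are reachable through the file system namespace.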
Memory
• 64-bit kernel and process address space
• Optimizes memory use by sharing program binaries and application data among processes
• The VM system manages most objects related to I/O and memory, kernel and user applications, shared libraries and file systems
  – Manages virtual-to-physical mapping of memory.
  – Manages swapping memory between primary and secondary storage to optimize performance.
  – Handles requests of shared images between multiple users and processes.
  – Acts as an integrated file cache.
• Newer features in the VM implementation include:
  – During I/O, uses the 64-bit address space to create a permanent mapping of all physical pages into SEGKPM, eliminating the need to map/unmap for each I/O.
  – Variable page sizes; the largest available now is 256 Mbytes
  – Generic framework: Multiple Page Size Selection (MPSS) for various page sizes
  – Support for non-uniform memory access (NUMA) architectures
  – Dynamic reconfiguration – new pages can be added to the free list on the fly while the kernel is in a safe "kernel cage"
  – Modern memory allocators support slabs
Virtual Memory
• Pages can vary in size; a common size is 8 Kbytes.
• The Solaris kernel uses a combined demand-paged and swapping model.
• Abstract memory objects called segments, vnodes, and pages
  – Physical memory, in chunks called pages
  – Virtual file object called a vnode
  – The file system is a hierarchy of vnodes
  – Process and kernel address spaces as segments of mapped vnodes
  – Mapped hardware devices (i.e. frame buffers) are segments of hardware-mapped pages
• Physical memory management is done by the Hardware Address Translation (HAT) layer
  – Keeps the rest of the VM implementation machine independent
Virtual Memory Continued
• A process's virtual address space skeleton is created by the kernel when the fork() system call creates the process
• Memory is allocated on the heap; malloc() doesn't create physical memory
• The heap can be allocated in 32 or 64 bit mode, and is much larger in 64 bit mode.
• The picture on the right shows how memory mapping can share data among processes
• Several options govern how a file is shared when it is mapped between processes
  – MAP_SHARED can be combined with PROT_READ|PROT_WRITE
  – MAP_PRIVATE can be combined with PROT_READ|PROT_WRITE
• Each segment has a protection mode: Read, Write, or Executable.
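The difference between the two mapping flags can be seen with a short sketch. It uses Python's `mmap` module on a Unix-like system as a stand-in for the C `mmap()` call; the helper name `write_through_mapping` and the temporary file are inventions for this example.

```python
import mmap
import os
import tempfile


def write_through_mapping(flags):
    """Map a small temp file with `flags`, overwrite it via the mapping,
    and return what the file itself contains afterwards."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"original")
        prot = mmap.PROT_READ | mmap.PROT_WRITE
        with mmap.mmap(fd, 8, flags=flags, prot=prot) as mapping:
            mapping[:8] = b"modified"
            if flags == mmap.MAP_SHARED:
                mapping.flush()  # push shared changes back to the file
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling it with `mmap.MAP_SHARED` leaves the file changed, while `mmap.MAP_PRIVATE` gives the process a copy-on-write view, so the file keeps its original bytes even though the protection bits were identical in both cases.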
Page Faults and Anonymous Memory
• A major page fault occurs when the physical page does not exist
• A minor page fault occurs when the page is in physical memory but no MMU translation exists (the page only needs to be attached)
• A protection fault occurs when an access violates memory permissions
• There can also be anonymous memory: pages that are not associated with a vnode. They are used for new heap space, and are allocated by a zero-fill-on-demand (ZFOD) operation.
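The three fault types amount to a small decision table. The sketch below is a simplification with hypothetical names (`Fault`, `classify_access`); a real kernel derives these facts from the segment and HAT structures rather than from boolean arguments, and the order of the checks is an assumption.

```python
from enum import Enum


class Fault(Enum):
    NONE = "no fault"
    MINOR = "minor"
    MAJOR = "major"
    PROTECTION = "protection"


def classify_access(in_physical_memory, has_mmu_translation, allowed_modes, requested_mode):
    """Classify one memory access using the three fault types from the slide."""
    if requested_mode not in allowed_modes:
        return Fault.PROTECTION  # access violates the segment's permissions
    if has_mmu_translation:
        return Fault.NONE        # the MMU can already resolve the address
    if in_physical_memory:
        return Fault.MINOR       # page is resident, only the translation is missing
    return Fault.MAJOR           # page must first be brought into physical memory
```

For example, a write to a read-only segment is a protection fault regardless of residency, while a read of a resident but unmapped page is only a minor fault.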
Intimate Shared Memory
• System V shared memory (IPC) option
• Shared memory optimization:
  – Additionally share low-level kernel data
  – Reduce redundant mapping info (virtual-to-physical)
• Shared memory is locked, never paged
  – No swap space is allocated
• Use the SHM_SHARE_MMU flag in shmat()
Physical Memory
• Memory is managed by the page scanner daemon (except kernel memory)
• When the system is booted, memory is placed on the free list in page-size chunks.
• Anonymous memory is used for most of a process's memory allocation (heap and stack).
• Pages are taken from the free list and then reside in the segmap cache, a process's address space, or the cache list.
• page_create_va() allocates pages, taking into account the virtual address to calculate page coloring.
• The page scanner uses global page replacement.
• Two bits are kept per page to indicate whether the page has been referenced or modified since the bits were last cleared.
Page swapping: "two-handed clock algorithm"
• In addition to this page-out process, the dispatcher can swap out entire processes to conserve memory. It does this rarely, and only in extreme circumstances.
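A toy simulation may help picture the two-handed clock: a front hand sweeps the pages clearing each reference bit, and a back hand follows a fixed distance behind and frees any page whose bit is still clear. The function and parameter names below are made up, and the model ignores modified pages, scan rates and free-memory thresholds.

```python
def two_handed_clock(referenced, handspread, steps, touches=None):
    """Simulate `steps` ticks of a two-handed clock over a circular page list.

    referenced: list of bools, one reference bit per page (mutated in place)
    handspread: how many pages the back hand trails behind the front hand
    touches:    optional {step: [page indexes]} referenced just before that step
    Returns the page indexes freed by the back hand, in order.
    """
    n = len(referenced)
    is_free = [False] * n
    freed = []
    for step in range(steps):
        for idx in (touches or {}).get(step, []):
            referenced[idx] = True          # the program used this page
        front = step % n
        referenced[front] = False           # front hand clears the reference bit
        if step >= handspread:              # back hand starts once the gap has opened
            back = (step - handspread) % n
            if not referenced[back] and not is_free[back]:
                is_free[back] = True        # untouched since the front hand passed
                freed.append(back)
    return freed
```

With four referenced pages and a handspread of 2, four ticks free pages 0 and 1. If page 0 is touched again just before tick 1, it survives the back hand and only page 1 is freed, which is the whole point of the gap between the hands.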
Slab and HAT
• Solaris has a general-purpose memory allocator known as the slab allocator. It is used for memory requests that are:
  – Smaller than a page size
  – Not an even multiple of a page size
  – Allocated and freed so frequently that they would cause fragmentation
• It solves fragmentation issues by grouping different-sized memory objects into separate caches, where each object cache has its own size and characteristics
• The HAT layer programs the TLB with entries identifying the relationship between virtual and physical addresses. If the TLB lookup fails, the UltraSPARC falls back to a translation storage buffer (TSB), while most other architectures use a hardware page table. This is a big difference, because the TSB is a software lookup, but Solaris provides both. Take a look at the slide titled "Virtual Memory" to see a picture of the HAT layer; it is on the right.
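The idea of per-size object caches can be sketched in a few lines. This toy `ObjectCache` is an illustration only: the class, its fields and the 8 Kbyte default slab size are assumptions for the example, and it tracks (slab, slot) pairs instead of real memory.

```python
class ObjectCache:
    """Toy cache of fixed-size objects carved out of page-sized slabs."""

    def __init__(self, object_size, slab_size=8192):
        self.object_size = object_size
        self.objects_per_slab = slab_size // object_size
        self.free_list = []   # (slab_id, slot) pairs ready for reuse
        self.slab_count = 0

    def alloc(self):
        if not self.free_list:
            self._grow()      # no free object: carve up a fresh slab
        return self.free_list.pop()

    def free(self, obj):
        # Freed objects go back to this cache, so the next alloc reuses them.
        self.free_list.append(obj)

    def _grow(self):
        slab_id = self.slab_count
        self.slab_count += 1
        # Push slots in reverse so that slot 0 is handed out first.
        self.free_list.extend(
            (slab_id, slot) for slot in reversed(range(self.objects_per_slab))
        )
```

Because every object in a cache has the same size, a freed slot always fits the next request exactly, which is how grouping by size avoids the fragmentation a general free list would suffer.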
Virtual File System (VFS)
• Created to abstract away file systems so NFS and UFS could co-exist
• Made of the vnode, the virtual node interface that implements file-related functions, and the vfs, the virtual file system that directs functions to specific file systems
• Structures consist of file descriptors in a file list, which point to a per-process file table. A vnode is looked up in this table, which eventually points to a physical node depending on the file system implementation.
• New in Solaris 10: Zettabyte File System (ZFS)
  – Endian neutral – move files between SPARC and x86 based systems
  – ZFS protects all data with 64-bit checksums
  – 128-bit file system!
  – Built on top of virtual storage pools
  – All operations are transactional and copy-on-write
Unix File System (UFS)
• The UFS we know and love: the default file system for Solaris, in development for over 20 years.
• Based around disk geometry: the number of sectors in a track, the location of the head, and the number of tracks.
• Supports hard and soft links.
• The inode (index node) is the internal descriptor for a file.
• Access scheme: user, group, world.
I/O
• Two distinct methods perform file system I/O:
  – read(), write(), and related system calls
  – Memory-mapping of a file into the process's address space
• Both are in the picture here to the right.
Performance: NUMA systems
• Non-Uniform Memory Access (NUMA) machines are machines in which some memory is closer to some CPUs than others
• Addressed by the Memory Placement Optimization (MPO) framework
  – Locality awareness
  – Balancing
  – Dynamic topology support
• Latency groups (lgroups) – sets of CPUs and equidistant memory defined in the kernel.
• A home lgroup is chosen for each thread upon creation, and the thread prefers this lgroup.
• For memory allocation, prefer the home lgroup, but if you know you have multithreaded, spread-out code, random placement may be better.
CMT support and Parallel System architectures
• Chip Multithreading (CMT) CPUs share various processor components and caches
• The three different parallel architectures
  – SMP. Symmetric multiprocessor with a shared memory model; single kernel image
  – MPP. Message-based model; multiple kernel images
  – NUMA/ccNUMA. Shared memory model; single kernel image
• So the Solaris kernel has several semaphores and mutex locks to help address concurrent thread memory access. SMP (like Intel and AMD chips) and CMT (the UltraSPARC T1) are a lot more complicated than a plain NUMA system, and much research goes on in this field. Sun's attitude is to try to make things as simple as possible while still providing the necessary synchronization.
Networking: The TCP/IP Stack
• Was two STREAMS layers, with packet queueing and locks between layers and 1 processor thread per connection
• Now merges the TCP and IP layers and allocates a single thread per CPU.
  – Streamlined to process a packet through both layers
  – Binds connections to a CPU for their entire life
• Uses a per-CPU vertical perimeter mechanism to protect the connection. It is implemented with an IP classifier, a serialization queue, and a worker thread, so only one CPU processes a specific packet.
• Integrated support for TCP offload engines – let the hardware do the work
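The classifier, serialization queue and worker can be pictured with a toy model. Everything here is hypothetical: the class name `MergedStack`, the round-robin choice of CPU for a new connection, and the tuple used as a connection key are assumptions for illustration, not the Solaris implementation.

```python
from collections import deque


class MergedStack:
    """Toy model: one serialization queue per CPU, connections bound for life."""

    def __init__(self, ncpus):
        self.squeues = [deque() for _ in range(ncpus)]
        self.bindings = {}      # connection -> CPU index, fixed once assigned
        self._next_cpu = 0

    def classify(self, conn):
        """Classifier: return the CPU this connection is bound to."""
        if conn not in self.bindings:
            # Assumed policy: bind new connections round-robin.
            self.bindings[conn] = self._next_cpu
            self._next_cpu = (self._next_cpu + 1) % len(self.squeues)
        return self.bindings[conn]

    def receive(self, conn, payload):
        """Queue a packet on the serialization queue of the connection's CPU."""
        cpu = self.classify(conn)
        self.squeues[cpu].append((conn, payload))
        return cpu

    def drain(self, cpu):
        """Worker for one CPU: process its queue in arrival order."""
        processed = []
        queue = self.squeues[cpu]
        while queue:
            processed.append(queue.popleft())
        return processed
```

Since every packet of a connection lands on the same queue and a single worker drains that queue, packets of one connection are never handled by two CPUs at once, so the connection state needs no per-packet locking between layers.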
Security
• For user permissions
  – UFS and file system permissions
  – Role Based Access Control since Solaris 8
  – New in Solaris 10: least privilege model
  – Access Control Lists let you make arbitrary security permissions
• Kernel level permissions: the privileged kernel threads and modules run the whole system and control Solaris containers.
• Automated patch tool
• Solaris Cryptographic Framework
• Full network traffic control, for example TCP packet monitoring, and disabling packet redirection and answers to system pings.
Solaris Containers/Zones
• Containers provide the complete virtualized environment; zones are the component that provides the isolation between environments.
• Up to 8192 virtualized environments per Solaris OS instance.
• Provides a secure sandbox that has its own root, users and file systems. Network interfaces, devices, hardware and I/O are all virtualized as well.
• The kernel makes sure that the zones are isolated.
• If a zone fails, it can reboot in a few seconds.
Process rights management
• The Solaris 10 OS least privilege model includes nearly 50 fine-grained privileges as well as the basic privilege set.
  – Evolved from Trusted Solaris.
  – The basic privilege set includes all privileges given to unprivileged processes in the traditional security model
• Each process has four sets in its kernel credentials
  – The Inheritable set (I): the privileges inherited on exec.
  – The Permitted set (P): the maximum set of privileges for the process.
  – The Effective set (E): the privileges currently in effect, a subset of P.
  – The Limit set (L): the upper bound of the privileges a process and its children may obtain.
• Once launched, a process uses privilege manipulation functions to add or remove privileges from the privilege sets
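The relationships between the four sets can be modelled with plain sets. This is a sketch under stated assumptions: the `Credentials` class and helper names are invented, the privilege strings are only labels, and the exec rule used here (I becomes I intersected with L, and P and E are reset to that result) is my reading of the documented behaviour for an ordinary exec, so treat it as an assumption rather than a specification.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Credentials:
    inheritable: frozenset
    permitted: frozenset
    effective: frozenset
    limit: frozenset

    def __post_init__(self):
        # The slide's invariant: E is always a subset of P.
        if not self.effective <= self.permitted:
            raise ValueError("effective set must be a subset of permitted set")


def drop_effective(cred, privs):
    """Temporarily give up privileges; they stay in P and can be raised again."""
    return replace(cred, effective=cred.effective - frozenset(privs))


def raise_effective(cred, privs):
    """Turn privileges back on; only privileges still in P may be raised."""
    privs = frozenset(privs)
    if not privs <= cred.permitted:
        raise PermissionError("cannot raise privileges outside the permitted set")
    return replace(cred, effective=cred.effective | privs)


def exec_transition(cred):
    """Assumed exec rule: the new process gets only I intersected with L."""
    new_i = cred.inheritable & cred.limit
    return Credentials(new_i, new_i, new_i, cred.limit)
```

A process that drops a privilege from E can raise it again later because it is still in P, whereas anything removed from I or L never reaches the program started by exec.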
Cryptography: Two Basic Types
• User level framework
  – Exists outside the kernel
  – Uses the PKCS 11 interface
  – Applications use it
• Kernel level framework
  – Operating system modules use it
  – Can interface with hardware and software plug-ins
• Neither provides actual encryption algorithms; plug-ins do all the work!
• Both are verified by the Module Verification Daemon
Cryptography Continued
• Each plug-in must be verified (signed) by the Module Verification Daemon
  – First, it sets up a thread pool that lives in the kCF to service requests
  – Second, it answers requests for verification of user and kernel level provider signatures
• [Tables: user level crypto algorithms supported; kernel level crypto algorithms supported]
• The cryptoadm tool is provided for administration of the uCF and kCF.
• /dev/crypto drivers allow communication between user and kernel level plug-ins
• /dev/cryptoadm runs the Module Verification Daemon
• For the user level, the digest and mac utilities calculate the digest and MAC of files, and the encrypt and decrypt utilities encrypt and decrypt files
• Solaris IPsec/IKE and Kerberos, user-level and kernel-level, have been ported to use the Solaris Cryptographic Framework in the Solaris 10 OS.
DTrace Debugging System
• Dynamically records data at points of interest (probes) in the user and kernel areas. Records stack trace, timestamp, and arguments.
• Kernel modules called providers know how to activate probes
• Has its own D language: a compiler looks up probes and providers, using the provider information to find which probes should be logged when fired.
• DTrace won the top prize in the Wall Street Journal's 2006 Technology Innovation Awards competition
• 30,000 published probes within the Solaris kernel
Recovery – Predictive self healing
• The self-diagnosing system is constantly gathering data.
• Error reports are encoded as a set of name-value pairs and form an error event.
• Diagnosis engines run in the background consuming error events.
• Diagnosis engines output a fault event, broadcast to all agents who can respond.
• Enter the Solaris Fault Manager
  – Manages the diagnosis engines and agents
  – Provides a programming model for clients
  – Compiles logs
  – Manages multiplexing of events between producers and consumers
• A Sun message identifier associates an error message with an online knowledge base article or link
• Diagnoses have a universal link identifier so that solutions can be cross-referenced
Why Solaris beats Linux
• Solaris is more secure – it has ACLs, RBAC, PRM, and containers vs. ACLs and Xen in Linux
• Solaris is more stable – Linux has rapid change and multiple centers of control, while Sun has a predictable lifecycle and the Solaris Application Guarantee.
• Solaris has better price/performance: SPECjAppServer2002 results
• Solaris has a lower cost for high-level support
Why Linux Beats Solaris
• Novell points out Solaris's higher cost for multiple-CPU machines
• Novell points out Solaris's poor performance
• But Sun has put out a lot of technology to fight criticisms, like ZFS to address big endian/little endian compatibility between SPARC and x86, and the Linux binary API to increase software options on Solaris.
Where Solaris is Headed
• Once the maker of the most popular UNIX-based OS in the world, Sun has lost a lot of market share.
  – Microsoft Windows took the low-end market away from most Unix systems
  – Linux came in to pull away the remainder
  – Solaris was left with the high-end space, with sales based on its stability, performance, and support
• Now, with Solaris 10 and OpenSolaris, Sun is trying to regain the low-end market
• Trying to work with AMD/Linux, not against it:
  – Linux Application Environment
  – Specific designs for AMD multiprocessor systems
  – Free OS with competitive support options
• Trusted Solaris features in Solaris 10 are a huge selling point
References
• Solaris 10: In a Class By Itself. http://www.sun.com/software/whitepapers/solaris10/classbyitself.pdf
• Solaris and Linux: SealRock research comparison whitepaper. http://www.sun.com/software/whitepapers/solaris10/sealrock.pdf
• Solaris 10 The Complete Reference. http://books.mcgrawhill.com/downloads/products/0072229985_ch01.pdf
• Solaris 8 Administrator Certification Training Guide, Appendix C. http://unixed.com/Resources/history_of_solaris.pdf
• Solaris™ Internals: Core Kernel Components. http://www.phptr.com/content/images/0130224960/samplechapter/0130224960.pdf
• Solaris™ Internals: Solaris 10 and OpenSolaris Kernel Architecture. http://www.sun.com/books/catalog/solaris_internals.xml
• The Solaris Cryptographic Framework. http://www.sun.com/bigadmin/features/articles/crypt_framework.pdf
• The least privilege model in the Solaris OS. http://www.sun.com/bigadmin/features/articles/least_privilege.html
• Solaris and Linux Seal Rock Research Paper. http://www.novell.com/collateral/4621445.pdf
• SUSE® Linux Enterprise Server 9 and Solaris 10 on x86. http://www.novell.com/collateral/4621445.pdf