- Number of slides: 82
INF 5070 – Media Storage and Distribution Systems: Server Resources 12/9 - 2005
Overview
- Resources, real-time, “continuous” media streams, …
- (CPU) scheduling
- Memory management
INF 5070 – media storage and distribution systems 2005 Carsten Griwodz & Pål Halvorsen
Resources and Real–Time
Resources
- Resource: “A resource is a system entity required by a task for manipulating data” [Steinmetz & Nahrstedt 95]
- Characteristics:
  - active: provides a service, e.g., CPU, disk or network adapter
  - passive: system capabilities required by active resources, e.g., memory
  - exclusive: only one process at a time can use it, e.g., CPU
  - shared: can be used by several concurrent processes, e.g., memory
  - single: exists only once in the system, e.g., loudspeaker
  - multiple: several within a system, e.g., CPUs in a multi-processor system
Real–Time
- Real-time process: “A process which delivers the results of the processing in a given time-span”
- Real-time system: “A system in which the correctness of a computation depends not only on obtaining the result, but also upon providing the result on time”
- Many real-time applications, e.g.:
  - temperature control in a nuclear/chemical plant
    - driven by interrupts from an external device
    - these interrupts occur irregularly
  - defense system on a navy boat
    - driven by interrupts from an external device
    - these interrupts occur irregularly
  - control of a flight simulator
    - execution at periodic intervals
    - scheduled by timer-services which the application requests from the OS
  - ...
Real–Time
- Deadline: “A deadline represents the latest acceptable time for the presentation of the processing result”
- Hard deadlines:
  - must never be violated → system failure
  - too late results
    - have no value, e.g., processing weather forecasts
    - mean severe (catastrophic) system failure, e.g., processing of an incoming torpedo signal in a navy boat scenario
- Soft deadlines:
  - in some cases, the deadline might be missed
    - not too frequently
    - not by much time
  - result still may have some (but decreasing) value, e.g., a late I-frame in MPEG
Real–Time and Multimedia
- Multimedia systems
  - have periodic processing requirements (e.g., each 33 ms in a 30 fps video)
  - require large bandwidths (e.g., average of 3.5 Mbps for DVD video only)
  - typically have soft deadlines (may miss a frame)
  - are non-critical (user may be annoyed, but …)
  - need predictability (guarantees)
  - adapt real-time mechanisms to continuous media
  - priority-based schemes are of special importance
Admission and Reservation
- To prevent overload, admission may be performed:
  - schedulability test:
    - “are there enough resources available for a new stream?”
    - “can we find a schedule for the new task without disturbing the existing workload?”
    - a task is allowed if the utilization remains < 1
  - yes: allow new task, allocate/reserve resources
  - no: reject
- Resource reservation is analogous to booking (asking for resources)
  - pessimistic
    - avoid resource conflicts by making worst-case reservations
    - potentially under-utilized resources
    - guaranteed QoS
  - optimistic
    - reserve according to average load
    - high utilization
    - overload may occur
  - perfect
    - must have detailed knowledge about resource requirements of all processes
    - too expensive to make/takes much time
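The schedulability test above can be sketched as a small admission controller. This is a minimal illustration, not a specific system's implementation: the class name `AdmissionController`, its fields and the example stream parameters are assumptions, and utilization is taken to be processing time divided by period.

```python
from dataclasses import dataclass, field


@dataclass
class AdmissionController:
    """Utilization-based admission test for periodic streams (illustrative)."""

    capacity: float = 1.0                         # usable fraction of the resource
    reserved: dict = field(default_factory=dict)  # stream name -> reserved share

    def utilization(self) -> float:
        return sum(self.reserved.values())

    def admit(self, name: str, processing_time: float, period: float) -> bool:
        """Reserve processing_time/period if utilization stays below capacity."""
        share = processing_time / period
        if self.utilization() + share >= self.capacity:
            return False                          # "no": reject
        self.reserved[name] = share               # "yes": allocate/reserve
        return True

    def release(self, name: str) -> None:
        """Give the reservation back when a stream ends."""
        self.reserved.pop(name, None)


ctrl = AdmissionController()
# Hypothetical streams: 14 ms of CPU per 1/30 s frame period (about 42 % each).
results = [ctrl.admit(f"video-{i}", 0.014, 1 / 30) for i in range(3)]
print(results)  # the third stream would push utilization past 1 and is rejected
```

Passing worst-case processing times corresponds to the pessimistic reservation style; passing average times corresponds to the optimistic one.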
Real–Time and Operating Systems
- The operating system manages local resources (CPU, memory, disk, network card, buses, ...)
- In a real-time, multimedia scenario, support is needed for:
  - real-time processing
  - efficient memory management
- This also means support for proper …
  - scheduling: high priorities for time-restrictive multimedia tasks
  - timer support: clock with fine granularity and event scheduling with high accuracy
  - kernel preemption: avoid long periods where low priority processes cannot be interrupted
  - memory replacement: prevent code for real-time programs from being paged out
  - fast switching: both interrupts and context switching should be fast
  - ...
Continuous Media Streams
Streaming Data
- Start playback at t1
- Consumed bytes (offset)
  - variable rate
  - constant rate
- Must start retrieving data earlier
  - data must arrive before consumption time
  - data must be sent before arrival time
  - data must be read from disk before sending time
[Figure: data offset versus time, showing the read, send, arrive and consume functions; playback starts at t1]
Streaming Data
- Need buffers to hold data between the functions, e.g., client: $B(t) = A(t) - C(t)$, i.e., $\forall t: A(t) \ge C(t)$
- Latest start of data arrival is given by $\min[B(t, t_0, t_1);\ \forall t\ B(t, t_0, t_1) \ge 0]$, i.e., the buffer must at all times t have more data to consume
[Figure: data offset versus time with the arrive and consume functions, t0 and t1]
Streaming Data
- “Continuous media” and “continuous streams” are ILLUSIONS
  - retrieve data in blocks from disk
  - transfer blocks from file system to application
  - send packets to communication system
  - split packets into appropriate MTUs
  - ... (intermediate nodes) ... (client)
- different optimal sizes (file system, communication system)
- pseudo-parallel processes (run in time slices)
→ need for scheduling (to have timing and appropriate resource allocation)
(CPU) Scheduling
Scheduling
- A task is a schedulable entity (a process/thread executing a job, e.g., a packet through the communication system or a disk request through the file system)
- In a multi-tasking system, several tasks may wish to use a resource simultaneously
- A scheduler decides which task may use the resource, i.e., determines the order in which requests are serviced, using a scheduling algorithm
- Each active resource (CPU, disk, NIC) needs a scheduler (passive resources are also “scheduled”, but in a slightly different way)
[Figure: requests → scheduler → resource]
Scheduling
- Scheduling algorithm classification:
  - dynamic
    - make scheduling decisions at run-time
    - flexible to adapt
    - considers only actual task requests and execution time parameters
    - large run-time overhead finding a schedule
  - static
    - make scheduling decisions off-line (also called pre-run-time)
    - generates a dispatching table for the run-time dispatcher at compile time
    - needs complete knowledge of the task before compiling
    - small run-time overhead
  - preemptive
    - currently executing task may be interrupted (preempted) by higher priority processes
    - preempted process continues later at the same state
    - potentially frequent context switching
    - (almost!?) useless for disk and network cards
  - non-preemptive
    - running tasks will be allowed to finish their time-slot (higher priority processes must wait)
    - reasonable for short tasks like sending a packet (used by disk and network cards)
    - less frequent switches
Scheduling
- Preemption:
  - tasks wait for processing
  - scheduler assigns priorities
  - task with highest priority will be scheduled first
  - preempt current execution if a higher priority (more urgent) task arrives
  - real-time and best effort priorities (real-time processes have higher priority: if any exist, they will run)
  - two kinds of preemption:
    - preemption points
      - predictable overhead
      - simplified scheduler accounting
    - immediate preemption
      - needed for hard real-time systems
      - needs special timers and fast interrupt and context switch handling
[Figure: requests → scheduler (preemption) → resource]
Scheduling
- Scheduling is difficult and takes time:
[Figure: delay between an RT process request and its execution among processes 1 … N under three schedulers: round-robin; priority, non-preemptive; priority, preemptive (only delay from switching and interrupts)]
Priorities and Multimedia
- Multimedia streams need predictable access to resources, i.e., high priorities, e.g.:
  1. multimedia traffic with guaranteed QoS (may not exist)
  2. multimedia traffic with predictive QoS
  3. other requests (must not starve)
- Within each class one could have a second-level scheduler
  - 1 and 2: real-time scheduling and fine grained priorities
  - 3: may use traditional approaches such as round-robin
Scheduling in Windows 2000
- Preemptive kernel
- Schedules threads individually
- Time slices given in quantums
  - 3 quantums = 1 clock interval (length of interval may vary)
  - defaults:
    - Win 2000 server:
    - Win 2000 workstation (professional): 36 quantums
  - may manually be increased between threads (1x, 2x, 4x, 6x)
  - foreground quantum boost (add 0x, 1x, 2x): active window can get longer time slices (assumed to need fast response)
Scheduling in Windows 2000
- 32 priority levels: Round Robin (RR) within each level
- Interactive and throughput-oriented:
  - “Real time”: 16 system levels
    - fixed priority
    - may run forever
  - Variable: 15 user levels
    - priority may change: thread priority = process priority ± 2
    - uses much: priority drops
    - user interactions, I/O completions: priority increases
  - Idle/zero-page thread: 1 system level
    - runs whenever there are no other processes to run
    - e.g., clearing memory pages for memory manager
[Figure: priority levels 31 … 16 Real Time (system thread), 15 … 1 Variable (user thread), 0 Idle (system thread)]
Scheduling in Linux
- Preemptive kernel
- Threads and processes used to be equal, but Linux uses (in 2.6) thread scheduling
- SCHED_FIFO
  - may run forever, no timeslices
  - may use its own scheduling algorithm
- SCHED_RR
  - each priority in RR
  - timeslices of 10 ms (quantums)
- SCHED_OTHER
  - ordinary user processes
  - uses “nice”-values: 1 ≤ priority ≤ 40
  - timeslices of 10 ms (quantums)
- Threads with highest goodness are selected first:
  - realtime (FIFO and RR): goodness = 1000 + priority
  - timesharing (OTHER): goodness = (quantum > 0 ? quantum + priority : 0)
- Quantums are reset when no ready process has quantums left (end of epoch): quantum = (quantum/2) + priority
[Figure: SCHED_FIFO levels 1, 2, … 126, 127; SCHED_RR levels 1, 2, … 126, 127; SCHED_OTHER nice values -20, -19, … 18, 19 with default (20)]
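The goodness rule and the end-of-epoch quantum reset described above can be sketched as follows. This only mirrors the formulas on the slide; the `Thread` class, the policy strings and the function names are illustrative assumptions, not kernel code.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    policy: str       # "FIFO", "RR" or "OTHER" (labels assumed for this sketch)
    priority: int
    quantum: int = 0  # remaining quantums in the current epoch


def goodness(t: Thread) -> int:
    """goodness as given on the slide."""
    if t.policy in ("FIFO", "RR"):
        return 1000 + t.priority                        # realtime classes
    return t.quantum + t.priority if t.quantum > 0 else 0  # timesharing


def pick_next(ready: list) -> Thread:
    """Select the ready thread with the highest goodness."""
    return max(ready, key=goodness)


def start_new_epoch(threads: list) -> None:
    """Reset quantums once no ready thread has quantums left."""
    for t in threads:
        t.quantum = t.quantum // 2 + t.priority


ready = [Thread("editor", "OTHER", 20, quantum=5),
         Thread("batch", "OTHER", 25, quantum=0),
         Thread("player", "RR", 1)]
print(pick_next(ready).name)      # the realtime thread wins
print(pick_next(ready[:2]).name)  # among timesharing threads, quantum matters
```

A thread that did not use up its quantum carries half of it into the next epoch, which favors threads that sleep often.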
Scheduling in AIX
- Similar to Linux, but has always only used thread scheduling
  - SCHED_FIFO
  - SCHED_RR
  - SCHED_OTHER
- BUT, SCHED_OTHER may change “nice” values
  - running long (whole timeslices): penalty, nice increase
  - interrupted (e.g., I/O): gives initial “nice” value back
[Figure: SCHED_FIFO levels 1, 2, … 126, 127; SCHED_RR levels 1, 2, … 126, 127; SCHED_OTHER nice values -20, -19, … 18, 19 with default]
Real–Time Scheduling
- Multimedia streams are usually periodic (fixed frame rates and audio sample frequencies)
- Time constraints for a periodic task:
  - s: starting point (first time the task requires processing)
  - e: processing time
  - d: deadline
  - p: period (r: rate, r = 1/p)
  - 0 ≤ e ≤ d (often d ≤ p: we’ll use d = p, the end of the period, but Σd ≤ Σp is enough)
  - the kth processing of the task
    - is ready at time s + (k – 1) p
    - must be finished at time s + (k – 1) p + d
  - the scheduling algorithm must account for these properties
[Figure: timeline showing s, e, d and p]
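The time constraints of a periodic task translate directly into code. The sketch below uses the slide's symbols s, e, p and d; the class name `PeriodicTask`, the method names and the 25 fps example are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodicTask:
    s: float  # starting point: first time the task requires processing
    e: float  # processing time per period
    p: float  # period (rate r = 1 / p)
    d: float  # deadline relative to the start of each period

    def __post_init__(self):
        if not 0 <= self.e <= self.d:
            raise ValueError("need 0 <= e <= d")

    def ready_time(self, k: int) -> float:
        """Time at which the k-th processing is ready: s + (k - 1) p."""
        return self.s + (k - 1) * self.p

    def finish_by(self, k: int) -> float:
        """Time by which the k-th processing must finish: s + (k - 1) p + d."""
        return self.ready_time(k) + self.d


frame = PeriodicTask(s=0, e=10, p=40, d=40)  # hypothetical 25 fps video, times in ms
print(frame.ready_time(3), frame.finish_by(3))
```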
Real–Time Scheduling
- Resource reservation
  - QoS can be guaranteed
  - relies on knowledge of tasks
  - no fairness
  - origin: time sharing operating systems
  - e.g., earliest deadline first (EDF) and rate monotonic (RM) (AQUA, HeiTS, RT Upcalls, ...)
- Proportional share resource allocation
  - no guarantees
  - requirements are specified by a relative share
  - allocation in proportion to competing shares
  - size of a share depends on system state and time
  - origin: packet switched networks
  - e.g., Scheduler for Multimedia And Real-Time (SMART) (Lottery, Stride, Move-to-Rear List, ...)
Earliest Deadline First (EDF)
- Preemptive scheduling based on dynamic task priorities
- Task with closest deadline has highest priority → stream priorities vary with time
- Dispatcher selects the highest priority task
- Assumptions:
  - requests for all tasks with deadlines are periodic
  - the deadline of a task is equal to the end of its period (start of the next)
  - independent tasks (no precedence)
  - run-time for each task is known and constant
  - context switches can be ignored
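Under exactly these assumptions (periodic tasks, deadline equal to the end of the period, known constant run-time, free context switches), EDF can be simulated in a few lines. This is a discrete-time sketch with integer ticks; the function name `edf_schedule` and the two example tasks are hypothetical.

```python
def edf_schedule(tasks, horizon):
    """Simulate preemptive EDF in unit ticks.

    tasks maps a task name to (e, p): processing time and period in ticks.
    All tasks start at time 0 and have their deadline at the end of the period.
    Returns the per-tick dispatching decisions and the detected deadline misses.
    """
    pending = []              # jobs as [absolute deadline, task name, remaining ticks]
    timeline, misses = [], []
    for now in range(horizon):
        # A job whose deadline has passed with work left is a miss; drop it.
        for deadline, name, _remaining in pending:
            if deadline <= now:
                misses.append((name, deadline))
        pending = [job for job in pending if job[0] > now]
        # Release the next job of every task whose period starts now.
        for name, (e, p) in tasks.items():
            if now % p == 0:
                pending.append([now + p, name, e])
        if not pending:
            timeline.append(None)   # idle tick
            continue
        job = min(pending)          # earliest absolute deadline first
        job[2] -= 1
        timeline.append(job[1])
        if job[2] == 0:
            pending.remove(job)
    return timeline, misses


# Two hypothetical streams with a total utilization of 2/5 + 4/7 (about 0.97).
timeline, misses = edf_schedule({"A": (2, 5), "B": (4, 7)}, horizon=35)
print("".join(name or "." for name in timeline), misses)
```

Because the priority is recomputed from the deadlines at every tick, the same task can preempt in one period and be preempted in the next, which is what "priorities vary with time" means here.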
Earliest Deadline First (EDF)
- Example:
[Figure: tasks A and B with their deadlines and the resulting dispatching over time; priority A < priority B in one interval and priority A > priority B in another]
Rate Monotonic (RM) Scheduling
- Classic algorithm for hard real-time systems with one CPU [Liu & Layland ‘73]
- Preemptive scheduling based on static task priorities
- Optimal: no other algorithm with static task priorities can schedule tasks that cannot be scheduled by RM
- Assumptions:
  - requests for all tasks with deadlines are periodic
  - the deadline of a task is equal to the end of its period (start of the next)
  - independent tasks (no precedence)
  - run-time for each task is known and constant
  - context switches can be ignored
  - any non-periodic task has no deadline
Rate Monotonic (RM) Scheduling
- Process priority based on task periods
  - task with shortest period gets highest static priority
  - task with longest period gets lowest static priority
  - dispatcher always selects task requests with highest priority
- Example:
[Figure: priority versus period length (shortest period, highest priority; longest period, lowest priority); tasks 1 and 2 with periods p1 < p2, so task 1 has the highest priority; resulting dispatching]
EDF Versus RM
- It might be impossible to prevent deadline misses in a strict, fixed priority system:
[Figure: tasks A and B with their deadlines over time, dispatched under six policies: fixed priorities with A having priority (without and with dropping), fixed priorities with B having priority (without and with dropping), earliest deadline first, and rate monotonic (same as the first); annotations mark deadline misses and wasted time]
- RM may give some deadline violations which are avoided by EDF
EDF Versus RM
- EDF
  - dynamic priorities changing in time
  - overhead in priority switching
  - QoS calculation, maximal throughput: $\sum_{\text{all streams } i} R_i \times e_i \le 1$, R: rate, e: processing time
- RM
  - static priorities based on periods
  - may map priority onto fixed OS priorities (like Linux)
  - QoS calculation: $\sum_{\text{all streams } i} R_i \times e_i \le \ln(2)$, R: rate, e: processing time
- NOTE: this means that EDF is usually more efficient than RM, i.e., if switches are free and EDF uses resources ≤ 1, then RM may need ≤ ln(2) resources to schedule the same workload
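The two QoS calculations can be compared on a concrete workload. The sketch below applies the bounds as stated on the slide; the function names are assumptions. Note that ln(2) ≈ 0.693 is the limit of the Liu & Layland bound n(2^(1/n) − 1) for many tasks and is a sufficient test only, so a workload that fails it is not necessarily unschedulable under RM.

```python
import math


def utilization(streams):
    """Sum of R_i * e_i over all streams; each stream is (rate per second, seconds)."""
    return sum(rate * e for rate, e in streams)


def edf_admissible(streams) -> bool:
    return utilization(streams) <= 1.0


def rm_admissible(streams) -> bool:
    return utilization(streams) <= math.log(2)


# One stream needing 15 timeslots of 28 ms per second (42 % of the CPU).
video = [(15, 0.028)]
# Three streams needing 15 timeslots of 21 ms per second each (94.5 % in total).
three = [(15, 0.021)] * 3

for name, workload in (("video", video), ("three", three)):
    print(name, round(utilization(workload), 3),
          edf_admissible(workload), rm_admissible(workload))
```

The second workload passes the EDF test but not the ln(2) test, which illustrates why EDF can accept workloads that a static-priority guarantee would have to reject.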
SMART (Scheduler for Multimedia And Real–Time applications)
- Designed for multimedia and real-time applications
- Principles
  - priority: high priority tasks should not suffer degradation due to presence of low priority tasks
  - proportional sharing: allocate resources proportionally and distribute unused resources (work conserving)
  - tradeoff immediate fairness: real-time and less competitive processes (short-lived, interactive, I/O-bound, ...) get instantaneous higher shares
  - graceful transitions: adapt smoothly to resource demand changes
  - notification: notify applications of resource changes
- Proportional shares → no admission control
SMART (Scheduler for Multimedia And Real–Time applications)
- Tasks have importance and urgency
  - urgency: an immediate real-time constraint, short deadline (determines when a task will get resources)
  - importance: a priority measure
    - expressed by a tuple: [priority p, biased virtual finishing time bvft]
    - p is static: supplied by user or assigned a default value
    - bvft is dynamic:
      - virtual finishing time: degree to which the share was consumed
      - bias: bonus for interactive tasks
- Best effort schedule based on urgency and importance
  - find most important tasks by comparing tuples: $T_1 > T_2 \Leftrightarrow (p_1 > p_2) \lor (p_1 = p_2 \land bvft_1 > bvft_2)$
  - sort after urgency (EDF based sorting)
  - iteratively select task from candidate set as long as schedule is feasible
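The three steps of the best effort schedule can be illustrated with a strongly simplified sketch. It is not the SMART implementation: the `Task` fields, the feasibility check (cumulative run-time in deadline order must not exceed any deadline) and the example values are assumptions. The tuple comparison follows the slide, so a larger (p, bvft) tuple counts as more important.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    priority: int    # p: static
    bvft: float      # biased virtual finishing time: dynamic
    deadline: float  # urgency, absolute time
    runtime: float   # remaining processing time


def importance(task: Task):
    # Python compares tuples lexicographically: p first, then bvft.
    return (task.priority, task.bvft)


def feasible(tasks, now=0.0) -> bool:
    """True if running the tasks in deadline order meets every deadline."""
    finish = now
    for task in sorted(tasks, key=lambda t: t.deadline):
        finish += task.runtime
        if finish > task.deadline:
            return False
    return True


def best_effort_schedule(tasks, now=0.0):
    """Add tasks by decreasing importance while the schedule stays feasible."""
    chosen = []
    for task in sorted(tasks, key=importance, reverse=True):
        if feasible(chosen + [task], now):
            chosen.append(task)
    return sorted(chosen, key=lambda t: t.deadline)  # EDF based ordering


tasks = [Task("a", priority=2, bvft=1, deadline=10, runtime=6),
         Task("b", priority=1, bvft=5, deadline=5, runtime=3),
         Task("c", priority=1, bvft=2, deadline=8, runtime=4)]
print([t.name for t in best_effort_schedule(tasks)])
```

In this example task c is left out: it is the least important, and adding it would make the more important task a miss its deadline.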
Evaluation of a Real–Time Scheduling
- Tests performed
  - by IBM (1993)
  - executing tasks with and without EDF
  - on a 57 MHz, 32 MB RAM, AIX Power 1
- Video playback program:
  - one real-time process
    - read compressed data
    - decompress data
    - present video frames via X server to user
  - process requires 15 timeslots of 28 ms each per second → 42 % of the CPU time
Evaluation of a Real–Time Scheduling
[Figure: laxity (remaining time to deadline) versus task number, with 3 load processes competing with the video playback]
- the real-time scheduler reaches all its deadlines
- several deadline violations by the non-real-time scheduler
Evaluation of a Real–Time Scheduling
- Varied the number of load processes (competing with the video playback)
[Figure: laxity (remaining time to deadline) versus task number for only the video process, 4 other processes and 16 other processes]
- NB! The EDF scheduler kept its deadlines
Evaluation of a Real–Time Scheduling
- Tests again performed
  - by IBM (1993)
  - on a 57 MHz, 32 MB RAM, AIX Power 1
- “Stupid” end system program:
  - 3 real-time processes only requesting CPU cycles
  - each process requires 15 timeslots of 21 ms each per second → 31.5 % of the CPU time each → 94.5 % of the CPU time required for real-time tasks
Evaluation of a Real–Time Scheduling
[Figure: laxity (remaining time to deadline) versus task number, with 1 load process competing with the real-time processes]
- the real-time scheduler reaches all its deadlines
Evaluation of a Real–Time Scheduling
[Figure: laxity (remaining time to deadline) versus task number for processes 1, 2 and 3, with 16 load processes competing with the real-time processes]
- Regardless of other load, the EDF scheduler reaches its deadlines (laxity almost the same as in the 1 load process scenario)
- NOTE: processes are scheduled in the same order
Memory Management
Delivery Systems
[Figure: delivery system with network and bus(es)]
Delivery Systems
[Figure: application in user space; file system and communication system in kernel space; bus(es)]
→ several in-memory data movements and context switches
→ several disk-to-memory transfers
Memory Caching
Memory Caching
[Figure: application, file system, communication system, disk and network card, with the labels “caching possible”, “cache” and “expensive”]
How do we manage a cache?
- how much memory to use?
- how much data to prefetch?
- which data item to replace?
- …
Is Caching Useful in a Multimedia Scenario?
- High rate data may need lots of memory for caching…
Buffer vs. Rate
| Buffer | 160 Kbps (e.g., MP3) | 1.4 Mbps (e.g., uncompressed CD) | 3.5 Mbps (e.g., average DVD video) | 100 Mbps (e.g., uncompressed HDTV) |
| 100 MB | 85 min 20 s | 9 min 31 s | 3 min 49 s | 8 s |
| 1 GB | 14 hr 33 min 49 s | 1 hr 37 min 31 s | 39 min 01 s | 1 min 22 s |
| 16 GB | 233 hr 01 min 01 s | 26 hr 00 min 23 s | 10 hr 24 min 09 s | 21 min 51 s |
| 32 GB | 466 hr 02 min 02 s | 52 hr 00 min 46 s | 20 hr 48 min 18 s | 43 min 41 s |
(32 GB: maximum amount of memory (in total) that a Dell server can manage in 2004, and NOT all of it is used for caching)
- Tradeoff: amount of memory, algorithm complexity, gain, …
- Cache only frequently used data: how? (e.g., first (small) parts of a broadcast partitioning scheme, allow “top-ten” only, …)
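The playout time that a buffer can hold is simply its size in bits divided by the stream rate. The sketch below assumes binary prefixes for both sizes and rates (1 MB = 2^20 bytes, 1 Kbps = 2^10 bit/s); that interpretation is an assumption, chosen because it reproduces, e.g., the 85 min 20 s entry for 100 MB at 160 Kbps. Function and constant names are illustrative.

```python
KBPS = 2**10   # bit/s, binary prefix assumed
MBPS = 2**20
MB = 2**20     # bytes
GB = 2**30


def buffered_seconds(buffer_bytes: float, rate_bps: float) -> float:
    """Seconds of playout that fit into a buffer at a given stream rate."""
    return buffer_bytes * 8 / rate_bps


def as_h_m_s(seconds: float):
    """Split a duration into (hours, minutes, seconds), rounded to whole seconds."""
    total = round(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return hours, minutes, secs


for size, label in ((100 * MB, "100 MB"), (1 * GB, "1 GB")):
    for rate in (160 * KBPS, 1.4 * MBPS, 3.5 * MBPS, 100 * MBPS):
        print(label, rate, as_h_m_s(buffered_seconds(size, rate)))
```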
Need For Special “Multimedia Algorithms”?
- Most existing systems use an LRU-variant
  - keep a sorted list
  - replace first in list
  - insert new data elements at the end
  - if a data element is re-accessed (e.g., new client or rewind), move back to the end of the list
- Extreme example, video frame playout (LRU buffer ordered from shortest to longest time since access):
  - play video (7 frames): 4 3 2 1 7 6 5
  - rewind and restart playout at 1: 1 7 6 5 4 3 2
  - playout 2: 2 1 7 6 5 4 3
  - playout 3: 3 2 1 7 6 5 4
  - playout 4: 4 3 2 1 7 6 5
- In this case, LRU replaces the next needed frame. So the answer is in many cases YES…
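The effect can be reproduced with a small LRU simulation. The sketch assumes a buffer that holds one frame less than the video (6 of 7 frames), which is an assumption made for illustration; with repeated sequential playout, LRU then always evicts exactly the frame that is needed next.

```python
from collections import OrderedDict


def lru_hits(accesses, capacity: int) -> int:
    """Count cache hits for an access sequence under LRU replacement."""
    cache = OrderedDict()            # least recently used entry first
    hits = 0
    for frame in accesses:
        if frame in cache:
            hits += 1
            cache.move_to_end(frame)           # re-accessed: move to the end
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)      # replace first in list
            cache[frame] = True                # insert new element at the end
    return hits


playout = list(range(1, 8)) * 3      # play 7 frames, rewind, play again, twice
print(lru_hits(playout, capacity=6)) # buffer one frame too small
print(lru_hits(playout, capacity=7)) # whole video fits
```

With capacity 6 every access misses, although 6 of the 7 frames are always buffered; a replacement policy that knows the playout order would keep most of them useful.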
“Classification” of Mechanisms
- Block-level caching: consider a (possibly unrelated) set of blocks
  - each data element is viewed as an independent item
  - usually used in “traditional” systems
  - e.g., FIFO, LRU, CLOCK, …
  - multimedia (video) approaches:
    - Least/Most Relevant for Presentation (L/MRP)
    - …
- Stream-dependent caching: consider a stream object as a whole
  - related data elements are treated in the same way
  - research prototypes in multimedia systems
  - e.g.,
    - BASIC
    - DISTANCE
    - Interval Caching (IC)
    - Generalized Interval Caching (GIC)
    - Split and Merge (SAM)
    - SHR
Least/Most Relevant for Presentation (L/MRP) [Moser et al. 95]
- L/MRP is a buffer management mechanism for a single interactive, continuous data stream
  - adaptable to individual multimedia applications
  - preloads units most relevant for presentation from disk
  - replaces units least relevant for presentation
  - client pull based architecture
[Figure: server with a homogeneous stream (e.g., MJPEG video) of continuous object presentation units (COPUs, e.g., MJPEG video frames), a buffer, and a client sending requests]
Least/Most Relevant for Presentation (L/MRP) [Moser et al. 95]
- Relevance values are calculated with respect to current playout of the multimedia stream
  - presentation point (current position in file)
  - mode / speed (forward, backward, FF, FB, jump)
  - relevance functions are configurable
[Figure: relevance value (0 to 1.0) versus COPU number (10 to 26) around the current presentation point along the playback direction, for the referenced, history and skipped sets; COPUs: continuous object presentation units]
Least/Most Relevant for Presentation (L/MRP) [Moser et al. 95]
- Global relevance value
  - each COPU can have more than one relevance value
    - bookmark sets (known interaction points)
    - several viewers (clients) of the same stream
  - global relevance value = maximum relevance for each COPU
[Figure: relevance (0 to 1) versus COPU number (89 to 106) with loaded frames, the current presentation points of S1 and S2, the bookmark set, history set and referenced set, and the resulting global relevance value]
Least/Most Relevant for Presentation (L/MRP)
- L/MRP …
  - (+) … gives “few” disk accesses (compared to other schemes)
  - (+) … supports interactivity
  - (+) … supports prefetching
  - (−) … targeted for single streams (users)
  - (−) … expensive (!) to execute (calculate relevance values for all COPUs each round)
- Variations:
  - Q-L/MRP: extends L/MRP with multiple streams and changes prefetching mechanism (reduces overhead) [Halvorsen et al. 98]
  - MPEG-L/MRP: gives different relevance values for different MPEG frames [Boll et al. 00]
Interval Caching (IC)
- Interval caching (IC) is a caching strategy for streaming servers
  - caches data between requests for same video stream, based on playout intervals between requests
  - following requests are thus served from the cache filled by preceding stream
  - sort intervals on length, buffer requirement is data size of interval
  - to maximize cache hit ratio (minimize disk accesses) the shortest intervals are cached first
[Figure: video clip 1 with streams S11 to S13 and intervals I11, I12; video clip 2 with streams S21, S22 and interval I21; video clip 3 with streams S31 to S34 and intervals I31 to I33; intervals sorted by length: I32, I33, I12, I31, I11, I21]
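The selection rule (sort intervals on length, cache the shortest first) can be sketched in a few lines. The interval names follow the figure, but the sizes and the cache size are made-up values, and the function name is an assumption.

```python
def select_intervals(intervals, cache_size):
    """Pick the intervals to cache, shortest first, until the cache is full.

    intervals is a list of (name, size) pairs, where size is the amount of
    data between two consecutive streams of the same clip.
    """
    cached, used = [], 0
    for name, size in sorted(intervals, key=lambda interval: interval[1]):
        if used + size > cache_size:
            break   # intervals are sorted, so no later one fits either
        cached.append(name)
        used += size
    return cached


# Hypothetical interval sizes (e.g., in MB), consistent with the sorted order in the figure.
intervals = [("I11", 7), ("I12", 4), ("I21", 9),
             ("I31", 5), ("I32", 2), ("I33", 3)]
print(select_intervals(intervals, cache_size=15))
```

Every cached interval lets one following stream read from memory instead of disk, so preferring short intervals maximizes the number of streams served from a given amount of memory.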
Generalized Interval Caching (GIC)
- Interval caching (IC) does not work for short clips
  - a frequently accessed short clip will not be cached
- GIC generalizes the IC strategy
  - manages intervals for long video objects as IC
  - short intervals extend the interval definition
    - keep track of a finished stream for a while after its termination
    - define the interval for a short stream as the length between the new stream and the position of the old stream if it had been a longer video object
    - the cache requirement is, however, only the real requirement
  - cache the shortest intervals as in IC
[Figure: video clip 2 with streams S21, S22 and interval I21; short video clip 1 with streams S11, S12 and cache requirement C11]
LRU vs. L/MRP vs. IC Caching
- What kind of caching strategy is best (VoD streaming)?
  - caching effect
    - Memory (L/MRP): 4 streams from disk, 1 from cache
    - Memory (IC): 2 streams from disk, 3 from cache
    - Memory (LRU): 4 streams from disk, 1 from cache
[Figure: streams S1 to S5 of movie X with intervals I1 to I4, showing global relevance values, loaded page frames and wasted buffering]
LRU vs. L/MRP vs. IC Caching
- What kind of caching strategy is best (VoD streaming)?
  - caching effect (IC best)
  - CPU requirement
    - LRU:
        for each I/O request
          reorder LRU chain
    - L/MRP:
        for each I/O request
          for each COPU
            RV = 0
            for each stream
              tmp = r(COPU, p, mode)
              RV = max(RV, tmp)
    - IC:
        for each block consumed
          if last part of interval
            release memory element
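The L/MRP loop above (global relevance as the maximum over all streams) can be written out as follows. The relevance function `r` used here is a made-up linear decay ahead of the presentation point; the real relevance functions are configurable and also cover history and bookmark sets, so treat this purely as an illustration of the control flow and its cost.

```python
def r(copu: int, point: int, mode: str) -> float:
    """Hypothetical relevance function: 1.0 at the presentation point,
    decreasing by 0.1 per COPU in playback direction, 0 behind it."""
    direction = 1 if mode == "forward" else -1
    distance = (copu - point) * direction
    if distance < 0:
        return 0.0
    return max(0.0, 1 - distance / 10)


def global_relevance(copus, streams):
    """For each COPU, take the maximum relevance over all streams."""
    values = {}
    for copu in copus:
        rv = 0.0
        for point, mode in streams:
            rv = max(rv, r(copu, point, mode))
        values[copu] = rv
    return values


def replacement_victim(buffered, streams):
    """The buffered COPU that is least relevant for presentation."""
    values = global_relevance(buffered, streams)
    return min(buffered, key=lambda copu: values[copu])


streams = [(90, "forward"), (100, "forward")]   # two presentation points
values = global_relevance(range(89, 107), streams)
print(values[95], values[100], replacement_victim([89, 90, 95, 100], streams))
```

The nested loops run over every buffered COPU and every stream on each I/O request, which is the CPU cost the comparison refers to.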
In-Memory Copy Operations
In-Memory Copy Operations
[Figure: application, file system, communication system, disk and network card, with the label “expensive”]
Cost of Data Transfers
- Data copy operations are expensive
  - consume CPU, memory, hub, bus and interface resources (proportional to size)
  - profiling shows that ~40% of CPU time is consumed by copying data
  - speed-gap between memory and CPU increases
  - different access times to different banks
- System calls make a lot of switches between user and kernel space
  - ~450 ns in 2000 on 933 MHz Pentium III
  - ~920 ns in 2005 on 1.7 GHz Pentium IV
[Figure: memcpy() on a 1.7 GHz Pentium IV]
Cost of Data Transfers
- THUS, data movement costs should be kept small
  - careful management of contiguous media data
  - avoid unnecessary physical copy operations
  - apply appropriate buffer management schemes
  - reduce overhead by removing physical in-memory copy operations, i.e., ZERO-COPY data paths
Basic Idea of Zero–Copy Data Paths
[Figure: application in user space; file system and communication system in kernel space exchanging a data_pointer instead of copying the data; bus(es)]
Zero–Copy (Streaming) Mechanisms
- Linux: sendfile()
  - between two descriptors (file and TCP-socket)
  - bi-directional: disk-network and network-disk
  - need TCP_CORK
- AIX: send_file()
  - only TCP
  - uni-directional: disk-network
- Kernel streaming using zero-copy
- INSTANCE (MMBUF-based, in NetBSD v1.5):
  - by UniK/IFI (2000)
  - uni-directional: disk-network (network-disk ongoing work)
  - stream_read() and stream_send() (zero-copy 1): application streaming using zero-copy
  - stream_rdsnd() (zero-copy 2)
- splice(), stream(), IO-Lite, MMBUF, …
INSTANCE Zero–Copy Transfer Rate
- Zero-copy transfer rate limited by network card and storage system
  - saturated a 1 Gbps NIC and 32-bit, 33 MHz PCI
- Throughput increase of ~2.7 times per stream (can at least double the number of streams)
  - reduced processing time by approximately 50 %
  - huge improvement in number of concurrent streams
[Figure: transfer rates for “read, write, with copy”, “read, write, no copy” and “read, automatic write, no copy”, with the labels “approx. 12 Mbps” and “approx. 6 Mbps”]
A lot of research has been performed in this area! BUT, what is the status today of commodity operating systems?
Existing Linux Data Paths
Content Download
[Figure: application in user space; file system and communication system in kernel space; bus(es)]
Content Download: read / send
[Figure: read copies data from the page cache (filled by DMA transfer) to the application buffer; send copies it to the socket buffer, from which it leaves by DMA transfer]
- 2n copy operations
- 2n system calls
Content Download: mmap / send
[Figure: the application mmaps the file; send copies data from the page cache (filled by DMA transfer) to the socket buffer, from which it leaves by DMA transfer]
- n copy operations
- 1 + n system calls
Content Download: sendfile
[Figure: sendfile appends a descriptor for the page cache data to the socket buffer; the data is filled into the page cache by DMA transfer and leaves by gather DMA transfer]
- 0 copy operations
- 1 system call
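For content download, the sendfile path can be tried out from user space with very little code. The sketch below uses Python's `socket.sendfile()`, which relies on `os.sendfile()` where the platform provides it and otherwise falls back to plain `send()`, so whether the transfer is really zero-copy depends on the operating system. The helper names and the local socket pair standing in for a client connection are assumptions of this example.

```python
import os
import socket
import tempfile


def send_whole_file(sock: socket.socket, path: str) -> int:
    """Send the file at `path` over a connected stream socket; return bytes sent."""
    with open(path, "rb") as f:
        return sock.sendfile(f)   # one call for the whole file, no user-space buffer


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Receive exactly `count` bytes (or fewer if the peer closes early)."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            break
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


# Demo over a local socket pair (stands in for a TCP connection to a client).
sender, receiver = socket.socketpair()
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"media" * 100)
sent = send_whole_file(sender, tmp.name)
received = recv_exact(receiver, sent)
sender.close()
receiver.close()
os.unlink(tmp.name)
print(sent, received == b"media" * 100)
```

Compared with a read/send loop, the application never touches the payload, which is where the savings in copy operations and system calls come from.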
Content Download: Results
- Tested transfer of 1 GB file on Linux 2.6
- Both UDP (with enhancements) and TCP
[Figure: results for UDP and TCP]
Streaming
[Figure: application in user space; file system and communication system in kernel space; bus(es)]
Streaming: mmap / send
[Figure: data path. DMA transfer into the page cache, which the application maps with mmap; per packet: cork, send, send, uncork, with copies from the application buffer and from the page cache into the socket buffer; DMA transfer out of the socket buffer]
- 2n copy operations
- 1 + 4n system calls
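The four calls per packet can be sketched as follows for RTP over UDP on Linux. This is an illustration under stated assumptions, not the code measured in the lecture: the function names, the 1024-byte payload size, the SSRC, the payload type and the timestamp step are all made up, and `UDP_CORK` is defined by hand with its value from `<linux/udp.h>` in case the `socket` module does not export it. The socket is assumed to be a connected UDP socket.

```python
import mmap
import socket
import struct

UDP_CORK = 1  # value from <linux/udp.h>; Linux only


def rtp_header(seq, timestamp, ssrc=0x1234, payload_type=96):
    """Minimal 12-byte RTP fixed header: version 2, no padding, extension or CSRC."""
    return struct.pack("!BBHII", 0x80, payload_type & 0x7F,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)


def stream_mmap_cork(path, sock, payload=1024):
    """Send a non-empty file as RTP packets: cork, send header, send payload, uncork."""
    packets = 0
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm, \
         memoryview(mm) as view:
        for seq, off in enumerate(range(0, len(view), payload)):
            sock.setsockopt(socket.IPPROTO_UDP, UDP_CORK, 1)  # call 1: cork
            sock.send(rtp_header(seq, seq * 3000))            # call 2: header copy
            sock.send(view[off:off + payload])                # call 3: payload copy
            sock.setsockopt(socket.IPPROTO_UDP, UDP_CORK, 0)  # call 4: uncork, emits one datagram
            packets += 1
    return packets
```

While the socket is corked, the kernel accumulates both sends into one pending datagram and transmits it when the option is cleared, so the receiver sees the header and payload together.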
Streaming: mmap / writev
[Figure: data path. DMA transfer into the page cache, which the application maps with mmap; writev copies from the application buffer and from the page cache into the socket buffer; DMA transfer out of the socket buffer]
- 2n copy operations
- 1 + n system calls (three fewer calls per packet than the previous solution)
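A minimal sketch of the writev variant, again for a connected UDP socket. The names, the 1024-byte payload size and the RTP field values are assumptions for illustration. One `os.writev` call gathers the header and the payload slice into a single datagram, replacing the cork, send, send, uncork sequence.

```python
import mmap
import os
import struct


def rtp_header(seq, timestamp, ssrc=0x1234, payload_type=96):
    """Minimal 12-byte RTP fixed header: version 2, no padding, extension or CSRC."""
    return struct.pack("!BBHII", 0x80, payload_type & 0x7F,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)


def stream_mmap_writev(path, sock, payload=1024):
    """Send a non-empty file as RTP packets with one writev() per packet."""
    packets = 0
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm, \
         memoryview(mm) as view:
        for seq, off in enumerate(range(0, len(view), payload)):
            # Gather write: header buffer plus a slice of the mapped file.
            os.writev(sock.fileno(),
                      [rtp_header(seq, seq * 3000), view[off:off + payload]])
            packets += 1
    return packets
```

The number of copies is unchanged, since both the header and the payload are still copied into the socket buffer; only the number of system calls drops.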
Streaming: sendfile
[Figure: data path. DMA transfer into the page cache; per packet: cork, sendfile, uncork, with a copy from the application buffer into the socket buffer and a descriptor appended for the page cache data; the data leaves by gather DMA transfer]
- n copy operations
- 4n system calls
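The per-mechanism costs stated on the existing data path slides can be put side by side. The sketch below only evaluates those formulas; the function name `costs` and the 1 KB packet size used to derive n for a 1 GB file are assumptions, not values from the lecture.

```python
def costs(n):
    """Return (copy operations, system calls) per mechanism for n packets."""
    return {
        "download: read / send":    (2 * n, 2 * n),
        "download: mmap / send":    (n, 1 + n),
        "download: sendfile":       (0, 1),
        "streaming: mmap / send":   (2 * n, 1 + 4 * n),
        "streaming: mmap / writev": (2 * n, 1 + n),
        "streaming: sendfile":      (n, 4 * n),
    }


# Assumed workload: a 1 GB file split into 1 KB packets.
n = (1 << 30) // 1024

for name, (copies, calls) in costs(n).items():
    print(f"{name:26s} copies={copies:>9d} syscalls={calls:>9d}")
```

The table makes the motivation for the enhanced mechanisms visible: adding an RTP header per packet brings back per-packet copies and system calls that plain sendfile had removed.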
Streaming: Results
- Tested streaming of 1 GB file on Linux 2.6
- RTP over UDP
- Compared to not sending an RTP header over UDP, we get an increase of 29% (additional send call)
- More copy operations and system calls required: potential for improvements
[Figure: streaming results, with TCP sendfile (content download) shown for reference]
Enhanced Streaming Data Paths
Enhanced Streaming: mmap / msend
[Figure: data path. DMA transfer into the page cache, which the application maps with mmap; per packet: cork, send, msend, uncork, with a copy from the application buffer into the socket buffer and a descriptor appended for the page cache data; the data leaves by gather DMA transfer]
- msend allows sending data from an mmap'ed file without a copy
- n copy operations (the previous solution requires one more copy per packet)
- 1 + 4n system calls
Enhanced Streaming: mmap / rtpmsend
[Figure: data path. As for mmap / msend, with the per-packet cork, send, msend, uncork sequence replaced by a single rtpmsend call]
- RTP header copy integrated into the msend system call
- n copy operations
- 1 + n system calls (the previous solution requires three more calls per packet)
Enhanced Streaming: mmap / krtpmsend
[Figure: data path. DMA transfer into the page cache; an RTP engine in the kernel supplies the headers; a descriptor is appended to the socket buffer; the data leaves by gather DMA transfer]
- An RTP engine in the kernel adds RTP headers
- 0 copy operations (the previous solution requires one more copy per packet)
- 1 system call (the previous solution requires one more call per packet)
Enhanced Streaming: rtpsendfile
[Figure: data path. As for streaming with sendfile, with the per-packet cork, send, sendfile, uncork sequence replaced by a single rtpsendfile call; copy from the application buffer into the socket buffer, descriptor appended for the page cache data, gather DMA transfer out]
- RTP header copy integrated into the sendfile system call
- n copy operations
- n system calls (the existing solution requires three more calls per packet)
Enhanced Streaming: krtpsendfile
[Figure: data path. DMA transfer into the page cache; an RTP engine in the kernel supplies the headers; a descriptor is appended to the socket buffer; the data leaves by gather DMA transfer]
- An RTP engine in the kernel adds RTP headers
- 0 copy operations (the previous solution requires one more copy per packet)
- 1 system call (the previous solution requires one more call per packet)
Enhanced Streaming: Results
- Tested streaming of 1 GB file on Linux 2.6
- RTP over UDP
[Figure: results for mmap based mechanisms and sendfile based mechanisms, with the existing mechanism (streaming) and TCP sendfile (content download) shown for reference; annotated improvements of ~27% and ~25%]
The End: Summary
Summary
- All resources need to be scheduled
- Scheduling algorithms for multimedia tasks have to…
  - … consider real-time requirements
  - … provide good resource utilization
  - (… be implementable)
- Memory management is an important issue
  - caching
  - copying is expensive
- Rule of thumb: watch out for bottlenecks
  - copying
  - data touching operations
  - frequent context switches (system calls)
  - scheduling of slow devices (disk)
  - ...