

  1. Parallelizing the Web Browser
 Chris Jones, Rose Liu, Leo Meyerovich, Krste Asanovic, and Rastislav Bodik
 ParLab, UC Berkeley


  2. The Transition to Handhelds
 Power wall: Previous generations reused software of their ancestors.
 Mobiles will need parallel software.
 [Figure: log price vs. time for successive platform generations: Mainframe, Mini, WS, PC, Laptop, Handset, Ubiquitous]
 Soon on mobile: 4 cores x 2 threads x 8 SIMD = 64-way parallelism


  3. Why Parallelize a Browser?
 • Dominant application platform
  – easy deployment: apps downloaded, JS portable
  – productive programming: scripting, layout
 • … but not on handhelds
  – native frameworks for: iPhone, Google Android
  – slow: for Slashdot, Laptop: 3s => iPhone: 21s
 • Parallel browser may need new architecture
  – ex: JavaScript relies on “gotos”, is too serial


  4. Anatomy of a Browser
 [Diagram: browser pipeline in three stages]
 Frontend: page? → web servers → decompress → lex → parse + build DOM
 Scripting: script, plugin (decode image, …)
 Layout: layout → render


  5. Project Status
 1. Developed work-efficient algorithms
  work-efficient: no more work than sequential algo.
  – layout: parallel-map with a tiling optimization
  – layout: break up tree traversal into five parallel ones
  – lexing: speculation to break sequential dependencies
 2. Reexamining the scripting programming model
  – programmer productivity: from callbacks to actors
  – performance: adding structure to detect dependences


  6. Frontend: Lexing
 [Diagram: browser pipeline (page? → web servers → decompress → lex → parse + build DOM → script / plugin (decode image, …) → layout → render)]


  7. Lexing, from 10,000 feet
 Goal: given lexical spec and input, find lexemes
 STag ::= <[^>]*>
 Content ::= [^<]+
 ETag ::= </[^>]*>
 [Diagram: DFA for the spec, with transitions labeled ‘<’, ‘/’, Σ – {‘>’}, Σ – {‘/’} and Σ – {‘<’}; one state is labeled STag]
 Input: <b>Berkeley!</b> (label each character with its state)
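 The lexing goal above can be sketched as a small sequential DFA. This is a minimal sketch, not the authors' lexer: the slide's diagram is only partly recoverable, so the five state names (Content, TagOpen, STag, ETag, TagEnd) and the exact transitions are assumptions reconstructed from the three grammar rules.

```python
# Hypothetical state names; the slide only shows the grammar and part of the DFA.
CONTENT, TAG_OPEN, STAG, ETAG, TAG_END = "Content", "TagOpen", "STag", "ETag", "TagEnd"

def step(state, ch):
    """Return the DFA state entered after reading ch in the given state."""
    if state in (CONTENT, TAG_END):            # outside any tag
        return TAG_OPEN if ch == "<" else CONTENT
    if state == TAG_OPEN:                      # just read '<'
        if ch == "/":
            return ETAG
        return TAG_END if ch == ">" else STAG
    return TAG_END if ch == ">" else state     # inside STag/ETag until '>'

def label(text, start=TAG_END):
    """Label each character with the state the DFA enters after reading it."""
    labels, state = [], start
    for ch in text:
        state = step(state, ch)
        labels.append(state)
    return labels

def lexemes(text):
    """Group characters into (kind, lexeme) pairs using the per-character labels."""
    out, buf, kind = [], "", None
    for ch, st in zip(text, label(text)):
        if st == TAG_OPEN:                     # '<' ends any pending Content lexeme
            if buf:
                out.append((kind, buf))
                buf = ""
            kind = STAG                        # assume a start tag until '/' is seen
        elif st in (CONTENT, ETAG):
            kind = st
        buf += ch
        if st == TAG_END:                      # '>' closes the current tag lexeme
            out.append((kind, buf))
            buf = ""
    if buf:
        out.append((kind, buf))
    return out

if __name__ == "__main__":
    for kind, text in lexemes("<b>Berkeley!</b>"):
        print(kind, repr(text))
```

 Each character's label depends on the label of the character before it, which is the sequential dependence the following slides set out to break.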


  8. Inherently Sequential?
 STag ::= <[^>]*>
 Content ::= [^<]+
 ETag ::= </[^>]*>
 [Diagram: the same DFA; the input <b>Berkeley!</b> is split between Processor 1, Processor 2, …, and the start state of each later block is unknown (“?”)]


  9. An observation
 In lexing, irrespective of where DFA starts, it converges to a stable, recurring state
 [Diagram: lexing <b>Berkeley!</b> from different start states; the runs end up in the same states (“in ETag”, “in Content”)]
 Parallel scans thus need not scan from all possible states, just one, yielding a work-efficient algorithm.
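 The observation can be checked on a toy DFA by running it from every state at once and counting how many characters it takes until all runs agree. This is a sketch under assumptions: the five states and their transitions are a reconstruction from the grammar on the earlier slides, and `convergence_index` is a hypothetical helper name.

```python
# Hypothetical reconstruction of the tag/content DFA from the earlier slides.
STATES = ("Content", "TagOpen", "STag", "ETag", "TagEnd")

def step(state, ch):
    """Return the DFA state entered after reading ch in the given state."""
    if state in ("Content", "TagEnd"):         # outside any tag
        return "TagOpen" if ch == "<" else "Content"
    if state == "TagOpen":                     # just read '<'
        if ch == "/":
            return "ETag"
        return "TagEnd" if ch == ">" else "STag"
    return "TagEnd" if ch == ">" else state    # inside a tag until '>'

def convergence_index(text):
    """Number of characters read before all start states agree, or None."""
    current = set(STATES)
    for i, ch in enumerate(text):
        current = {step(s, ch) for s in current}
        if len(current) == 1:
            return i + 1
    return None

if __name__ == "__main__":
    print(convergence_index("b>Berkeley!</b>"))
    print(convergence_index("Berkeley!"))
```

 In this toy DFA the runs only merge after a '>' has been read, so a window with no tag boundary does not converge. That is why the scheme on the next slides treats the start state as a speculation that has to be checked.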


  10. Our solution (1/2): Partition
 • split input into blocks with k-character overlap
 • scan in parallel; start block from a tolerant state
 [Diagram: input split into blocks for Processor 1, Processor 2, …, each overlapping its predecessor by k characters]


  11. Our solution (2/2): Speculate
 • split input into blocks with k-character overlap
 • scan in parallel; start block from a tolerant state
 • check if blocks converge: expected in k-overlap
 • speculation may fail; if so, block is rescanned
 [Diagram: overlapping blocks scanned in parallel]
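 The partition-and-speculate steps above can be sketched as follows. This is an illustrative sketch rather than the authors' Cell implementation: the DFA is a reconstruction from the grammar slides, "Content" is assumed to be the tolerant state, `speculative_label` is a hypothetical name, and Python threads are used only to show the structure (they give no real speedup for this CPU-bound loop).

```python
from concurrent.futures import ThreadPoolExecutor

START = "Content"  # assumed true start state of the whole input

def step(state, ch):
    """Hypothetical tag/content DFA reconstructed from the grammar slides."""
    if state in ("Content", "TagEnd"):
        return "TagOpen" if ch == "<" else "Content"
    if state == "TagOpen":
        if ch == "/":
            return "ETag"
        return "TagEnd" if ch == ">" else "STag"
    return "TagEnd" if ch == ">" else state

def scan(text, state):
    """Sequentially label each character with the state entered after it."""
    labels = []
    for ch in text:
        state = step(state, ch)
        labels.append(state)
    return labels

def speculative_label(text, block_size, k, tolerant="Content"):
    """Label text block by block; return (labels, number of rescanned blocks)."""
    starts = list(range(0, len(text), block_size))

    def work(start):
        lo = max(0, start - k)                     # begin k characters early
        init = START if lo == 0 else tolerant
        labels = scan(text[lo:start + block_size], init)
        overlap = start - lo
        guess = labels[overlap - 1] if overlap else init  # speculated entry state
        return guess, labels[overlap:]

    with ThreadPoolExecutor() as pool:             # blocks are scanned independently
        results = list(pool.map(work, starts))

    out, state, rescans = [], START, 0
    for start, (guess, labels) in zip(starts, results):
        if start > 0 and guess != state:           # speculation failed: rescan block
            labels = scan(text[start:start + block_size], state)
            rescans += 1
        out.extend(labels)
        state = labels[-1]                         # true exit state of this block
    return out, rescans

if __name__ == "__main__":
    page = "<b>Berkeley!</b>" * 4
    labels, rescans = speculative_label(page, block_size=8, k=4)
    print(labels == scan(page, START), rescans)
```

 The check compares each block's speculated entry state with the true exit state of the block before it. Because the DFA is deterministic, a match means the speculatively computed labels are already correct, so only mismatching blocks are rescanned.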


  12. Speedup: Flex vs Cell
 today’s page sizes: 5 cores are 4.5x faster than flex
 baseline: (sequential) flex on the CELL main CPU


  13. Layout Solving (1/2)
 [Diagram: browser pipeline (page? → web servers → decompress → lex → parse + build DOM → script / plugin (decode image, …) → layout → render)]


  14. Rule Matching
 Goal: Match rules with nodes:
  – a rule: p img {fontsize: 7px}
  – match tag path
  – path-rule matching
   • end with the same node
   • and are a substring
 [Diagram: DOM tree with <body>, two <p> nodes, <img>, <b>, and text nodes “hello”, “world”, “ok ok ok ok ok”]
 selectors: p, img, p img
 properties: height=83%, width=100px, fontsize=7px, float=left
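 The path-rule matching above can be sketched as a suffix test on the node's tag path. This sketch follows the slide's wording literally (the selector ends with the same node and is a contiguous substring of the path); full CSS descendant selectors also allow intervening ancestors. Only `p img {fontsize: 7px}` comes from the slide; the other selector-to-property pairings in `RULES` are assumptions for illustration.

```python
# Only "p img" -> fontsize is stated on the slide; the other pairings are assumed.
RULES = {
    "p": {"height": "83%"},
    "img": {"width": "100px"},
    "p img": {"fontsize": "7px"},
}

def matches(selector, path):
    """True if the selector's tags form the tail of the node's tag path."""
    sel = selector.split()
    return 0 < len(sel) <= len(path) and path[-len(sel):] == sel

def match_rules(path, rules):
    """Merge the properties of every rule whose selector matches the path."""
    props = {}
    for selector, declared in rules.items():
        if matches(selector, path):
            props.update(declared)
    return props

if __name__ == "__main__":
    print(match_rules(["body", "p", "img"], RULES))
    print(match_rules(["body", "p"], RULES))
```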


  15. Parallelization
 • 1000s nodes, 1000s rules
 • Assign nodes to cores
 [Diagram: the same DOM tree, with nodes distributed across cores]
 selectors: p, img, p img
 properties: height=83%, width=100px, fontsize=7px, float=left
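 Assigning nodes to cores amounts to a parallel map over the nodes, since each node is matched against the rules independently. A minimal sketch, assuming the literal suffix-matching reading of the previous slide; `match_all` and the worker count are hypothetical, and Python threads only illustrate the structure (the talk reports results with Cilk++).

```python
from concurrent.futures import ThreadPoolExecutor

def matches(selector, path):
    """True if the selector's tags form the tail of the node's tag path."""
    sel = selector.split()
    return 0 < len(sel) <= len(path) and path[-len(sel):] == sel

def match_node(path, selectors):
    """All selectors matching one node; independent of every other node."""
    return [sel for sel in selectors if matches(sel, path)]

def match_all(paths, selectors, workers=4):
    """Parallel map: each worker handles whole nodes, with no shared writes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: match_node(p, selectors), paths))

if __name__ == "__main__":
    paths = [["body"], ["body", "p"], ["body", "p", "img"], ["body", "p", "b"]]
    print(match_all(paths, ["p", "img", "p img"]))
```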


  16. Tiling for Caches
 Problem: all the nodes + selectors might not fit in cache!
 [Diagram: the same DOM tree and rule table, partitioned into tiles]
 selectors: p, img, p img
 properties: height=83%, width=100px, fontsize=7px, float=left
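 One way to read the tiling idea is as loop blocking: match a small tile of nodes against a small tile of selectors before moving on, so the working set of the inner loops stays small. The sketch below shows only the loop structure; the tile sizes and the `match_tiled` name are hypothetical, and Python will not show the cache effect itself.

```python
def matches(selector, path):
    """True if the selector's tags form the tail of the node's tag path."""
    sel = selector.split()
    return 0 < len(sel) <= len(path) and path[-len(sel):] == sel

def match_tiled(paths, selectors, node_tile=2, rule_tile=2):
    """Match every node against every selector, one (node tile, rule tile) at a time."""
    result = [[] for _ in paths]
    for n0 in range(0, len(paths), node_tile):          # tile of nodes
        for r0 in range(0, len(selectors), rule_tile):  # tile of selectors
            for i in range(n0, min(n0 + node_tile, len(paths))):
                for sel in selectors[r0:r0 + rule_tile]:
                    if matches(sel, paths[i]):
                        result[i].append(sel)
    return result

if __name__ == "__main__":
    paths = [["body"], ["body", "p"], ["body", "p", "img"], ["body", "p", "b"]]
    print(match_tiled(paths, ["p", "img", "p img"]))
```

 The result is the same as the untiled double loop; only the visiting order changes.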


  17. Speedup (Cilk++)
 [Figure: Speedup vs. Fastest Sequential (Slashdot); x-axis: Cores (1 to 16); y-axis: Speedup (0 to 5); series: Redundancy opt. + tiling (Cilk), Naïve + tiling (Cilk), Redundancy opt. + tiling (seq.), Naïve (Cilk), Naïve (seq)]
 2 socket x 4 core x 2 thread (2.6 GHz, 12 x 1 GB)


  18. Layout Solving (2/2)
 [Diagram: browser pipeline (page? → web servers → decompress → lex → parse + build DOM → script / plugin (decode image, …) → layout → render)]


  19. Problem: Layout a Page
 [Diagram: DOM tree (<body>, two <p> nodes, <img>, <b>, text “hello”, “world”, “ok ok ok ok ok”) annotated with specified inputs (fs=50%, float=left, w=50) and solved attributes (w, fs, h, x, y); edges carry fs, Δ and w between parent and child]


  20. It looks rather sequential..
 [Diagram: the same annotated DOM tree, with fs, Δ and w dependencies chained from node to node across the whole tree]


  21. But not entirely
 [Diagram: the same annotated DOM tree, highlighting dependencies (fs, Δ, w) that can be resolved independently in separate subtrees]


  22. 5 Phases: Each Exhibits Tree Parallelism
 [Diagram: the DOM tree shown per phase, annotated with fs, w, w_p and w_m values (for example fs=50% giving fs=6 under fs=12; float = left on <img>)]
 Phase 1: font size, temporary width
 Phase 2: preferred max & min width
 Phase 3: solved width
 Phase 4: height, relative x/y position
 Phase 5: absolute x/y position
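 The five phases can be sketched as five tree traversals over a toy box model. Everything concrete here is an assumption for illustration: the alternating top-down / bottom-up directions, the `Node` fields, the half-em glyph width, vertical stacking of children, and the absence of floats. It is not the authors' layout solver.

```python
import math
from dataclasses import dataclass, field

GLYPH = 0.5  # hypothetical glyph width, as a fraction of the font size

@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)
    fs_scale: float = 1.0   # e.g. 0.5 for fs=50%
    chars: int = 0          # text length of a leaf (toy content model)
    fs: float = 0.0
    w_min: float = 0.0
    w_pref: float = 0.0
    w: float = 0.0
    h: float = 0.0
    rel_y: float = 0.0
    x: float = 0.0
    y: float = 0.0

def top_down(node, visit, parent=None):
    visit(node, parent)
    for child in node.children:     # sibling subtrees are independent here
        top_down(child, visit, node)

def bottom_up(node, visit):
    for child in node.children:     # sibling subtrees are independent here
        bottom_up(child, visit)
    visit(node)

def layout(root, page_w=100.0, root_fs=12.0):
    def font_size(n, p):            # phase 1
        n.fs = (p.fs if p else root_fs) * n.fs_scale
    def pref_widths(n):             # phase 2
        if n.children:
            n.w_min = max(c.w_min for c in n.children)
            n.w_pref = max(c.w_pref for c in n.children)
        else:
            n.w_min = GLYPH * n.fs
            n.w_pref = max(n.chars, 1) * GLYPH * n.fs
    def solved_width(n, p):         # phase 3
        avail = p.w if p else page_w
        n.w = max(n.w_min, min(n.w_pref, avail))
    def heights(n):                 # phase 4
        if n.children:
            y = 0.0
            for c in n.children:    # stack children vertically
                c.rel_y = y
                y += c.h
            n.h = y
        else:
            n.h = math.ceil(n.w_pref / n.w) * n.fs   # number of lines times fs
    def absolute(n, p):             # phase 5
        n.x = p.x if p else 0.0
        n.y = (p.y if p else 0.0) + n.rel_y
    top_down(root, font_size)
    bottom_up(root, pref_widths)
    top_down(root, solved_width)
    bottom_up(root, heights)
    top_down(root, absolute)
    return root

if __name__ == "__main__":
    page = Node("body", [Node("p", chars=5), Node("p", fs_scale=0.5, chars=20)])
    layout(page, page_w=40.0)
    for n in [page] + page.children:
        print(n.tag, n.fs, n.w, n.h, n.x, n.y)
```

 Within each traversal, a node needs values only from its parent (top-down) or from its children (bottom-up), so separate subtrees could be processed on different cores. The recursion here is sequential and only marks where that parallelism would go.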

