﻿<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
	"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<?xml-stylesheet href="xbl-shape-bindings.css" type="text/css"?>

<html xmlns="http://www.w3.org/1999/xhtml"
	xmlns:mml="http://www.w3.org/1998/Math/MathML"
	xmlns:svg="http://www.w3.org/2000/svg" 
	xmlns:xlink="http://www.w3.org/1999/xlink"
>

<head>
  <title>Self-Improving Algorithms for Delaunay Triangulations</title>
<!-- metadata -->
  <meta name="generator" content="S5" />
  <meta name="version" content="S5 1.1" />
  <meta name="presdate" content="20050128" />
  <meta name="author" content="Ken Clarkson &bull;" />
  <meta name="company" content="IBM Almaden" />
<!-- configuration parameters -->
  <meta name="defaultView" content="slideshow" />
  <meta name="controlVis" content="hidden" />
<!-- style sheet links -->
  <link rel="stylesheet" href="ui/default/slides.css" type="text/css"
 media="projection" id="slideProj" />
  <link rel="stylesheet" href="ui/default/outline.css" type="text/css"
 media="screen" id="outlineStyle" />
  <link rel="stylesheet" href="ui/default/print.css" type="text/css"
 media="print" id="slidePrint" />
  <link rel="stylesheet" href="ui/default/opera.css" type="text/css"
 media="projection" id="operaFix" />
<!-- embedded styles -->
  <style type="text/css" media="all">
.imgcon {width: 525px; margin: 0 auto; padding: 0; text-align: center;}
#anim {width: 270px; height: 320px; position: relative; margin-top: 0.5em;}
#anim img {position: absolute; top: 42px; left: 24px;}
img#me01 {top: 0; left: 0;}
img#me02 {left: 23px;}
img#me04 {top: 44px;}
img#me05 {top: 43px;left: 36px;}
  </style>
  <style type="text/css" media="all">
     .demo {display: block; padding: 0.5em 0.5em 0.5em; margin: 0 1.5em 0.5em; font-size: 90%;}
    .floatright {float : right;}
  </style>
  
  <style type="text/css" media="all">
      [class~="circle"]
      {
        stroke: red;
        stroke-width: 2;
        fill: red;
        fill-opacity: 0.1;
      }
      [class~="circ_control"]:hover {stroke:black; stroke-width:2; fill-opacity:0.2;}
  </style>

  <script src="ASCIIMathML.js" type="text/javascript" />
 <!--  <script src="impl.js" type="text/javascript" />-->
 <!-- S5 JS -->
  <script src="ui/default/slides.js" type="text/javascript" />
  <script type="text/javascript">
	AMsymbols = AMsymbols.concat([
	{input:">>", tag:"mo", output:"\u226B", tex:"gg"},
	{input:"ll", tag:"mo", output:"\u226A", tex:"ll"},
	{input:"sgn",  tag:"mo", output:"sgn", tex:null, ttype:CONST},
	{input:"exp",  tag:"mo", output:"exp", tex:null, ttype:CONST},
	{input:"Pr",  tag:"mo", output:"Pr", tex:null, ttype:CONST},
	{input:"argmax",  tag:"mo", output:"argmax", tex:null, ttype:UNDEROVER}
	]);
  </script>
</head>
<body>
<div class="layout">
   <div id="controls">
    <form action="#" id="controlForm" onmouseover="showHide('s');" onmouseout="showHide('h');">
      <div id="navLinks" class="hideme">
        <a accesskey="t" id="toggle" href="javascript:toggle();">&#216;</a>
        <a accesskey="z" id="prev" href="javascript:go(-1);">&laquo;</a>
        <a accesskey="x" id="next" href="javascript:go(1);">&raquo;</a>
      <div id="navList" ><select id="jumplist" onchange="go('j');"></select></div>
      </div>
    </form>
  </div>
<div id="currentSlide"><!-- DO NOT EDIT --></div>
<div id="footer">
   <h1>Self-Improving Algorithms</h1>
   <h2>Ken Clarkson / IBM Almaden</h2>
</div>

</div>

<ol class="xoxo presentation">

  <li class="slide"><h1>Self-Improving Algorithms for Delaunay Triangulations</h1>
	<br/>
    <h3 style = "color:orange">Nir Ailon, Bernard Chazelle, <span style="color:blue">Ken Clarkson*</span>,
    Ding Liu, <span style="color:gray">C. Seshadhri</span>, Wolfgang Mulzer</h3>
    <h3><span style="color:blue">*IBM Almaden</span><br/>
        (all others: <span style="color:orange">Princeton Univ.</span>, or <span style="color:gray">both</span>)</h3>
      <div width="400" style="position:absolute; bottom:0.75in; right:2in;">
		<canvas id="title_canvas" width="300" height="300"></canvas>
<!--	<applet code="jvLite.class" archive="../../../web/enets/javaView/jvLite.jar" name="JavaView"
				width="350" height="350" style="float:right;" hspace="10" vspace="10" codebase="./">
		<param name="Model" value="javaView/sphere.jvx"/>
		<param name="displayFile" value="javaView/sphere.jvd"/>
		<param name="autoRotate" value="Show"/> 
		<param name="background" value="255;255;255"/>
		<param name="Border" value="Hide"/>
		<param name="Antialias" value="Show"/>
		<param name="Depthcue" value="Hide"/>
      </applet>-->

     </div>
  </li>

  <li class="slide"><h1>Outline</h1>
    
    <ul>
      <li>Self-improving algorithms</li>
      <li>...for sorting</li>
      <ul><li>Sketch of analysis</li></ul>
      <li>...for Delaunay triangulation</li>
      <ul><li>Sketch of algorithm</li></ul>
    </ul>
  </li>

  <li class="slide"><h1>Sequences of Computational Problems</h1>
    <ul>
      <li>A computational problem: given dataset `I`, compute `f(I)`</li>
      <ul>
	<li class="incremental">`I=` a set of `n` values,<br/>
          `f(I)=` their sorted order, the rank of each input value</li>
	<li class="incremental">`I=` a set of `n` points in the plane,<br/>
          `f(I)=`  their Delaunay triangulation</li>
      </ul>
      <span class="incremental">
      <li>Problem sequence:</li>
      <ul>
        <li>Given datasets `I_1`, `I_2`, `I_3, ...` in turn,</li>
        <li>Compute `f()` of each in turn</li>
      </ul>
      <li>Sometimes the `I_k` are like each other in some way</li>
      </span>
    </ul>
    </li>  
      
  <li class="slide"><h1>Self-Improvement</h1>
      <ul>
	  <li>We show:</li>
          <ul>
            <li>When computing `f(I_k)`,</li>
            <li>experience of computing `f(I_1)`, `f(I_2), ..., f(I_{k-1})` can sometimes help</li>
            <li>More precisely: what helps is the previous datasets `I_1...I_{k-1}`, plus some additional work</li>
          </ul>
          <span class="incremental">
	  <li>We found <em>self-improving algorithms</em> for sorting and Delaunay triangulations</li>
	  <li>Self-improvement here means <em>using past experience to run faster</em></li>
	  <ul>
	    <li>An instance of learning</li>
	    <li>Often "learning" = "using past experience to classify better"</li>
	    <li>But generally, "learning" = "using past experience to perform better"</li>
	  </ul></span>
      </ul>
    </li>
    
    

 
  <li class="slide"><h1>Random Data and Comparisons</h1>
    <ul >
	<li>We assume that the datasets `I_k` are random, with the same distribution</li>
	<ul><li>The random variables are entire <em>sets</em>,<br/> not individual values or points</li></ul>
	<li>Thus each `f(I_k)` is also a random variable</li>
      <span class="incremental">
	<li>Our algorithms are <em>comparison-based</em></li>
	<ul><li>They ask a series of yes/no questions</li></ul>
	<li>How many comparisons are needed?</li>
      </span>
    </ul>
  </li>


  <li class="slide"><h1>Identifying the Output using Comparisons</h1>
    <ul>
	<li>Enough questions must be asked about instance `I` to tell the different `f(I)` apart</li>
	<ul>
	  <li>If there are eight possible outputs, two yes/no questions can tell apart at most four of them, so two are not enough</li>
	</ul>
	<li>The outcomes of the comparisons done by the algorithm must determine `f(I)`</li>
	<li>To use as few comparisons as possible,<br/>
          use more for `f(I)` that are less likely</li>
    </ul>
  </li>
  
  <li class="slide"><h1>Entropy Lower Bounds</h1>
    <ul>
        <li>These ideas suggest that the <em>entropy</em> of `f(I)` determines the number of comparisons needed</li>
        <li class="incremental">Suppose you want to send `f(I_1), f(I_2),...` over a communication channel</li>
        <li class="incremental">You could encode each `f(I)` by the bit sequence of the comparison results</li>
        <li class="incremental">The best encoding takes, in expectation, at least the entropy `H(f(I)) := sum_y Pr(y) log(1// Pr(y))` bits, where `y` ranges over the possible outputs</li>
	<li class="incremental">So: the optimal expected number of comparisons is at least `H(f(I))`</li>
     </ul>
  </li>
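
  <li class="slide"><h1>Sketch: Computing the Entropy Bound</h1>
    <ul>
      <li>A minimal JavaScript sketch of the entropy formula on the previous slide,
        assuming the output distribution is known and given as a table of probabilities</li>
      <li>The names <code>entropyBits</code>, <code>uniform</code>, and <code>skewed</code> are illustrative,
        and both distributions are made-up examples</li>
    </ul>
    <pre class="demo"><code>// Entropy in bits of a discrete distribution, given as an object that maps
// each possible output y to its probability Pr(y).
function entropyBits(probabilities) {
  let bits = 0;
  for (const p of Object.values(probabilities)) {
    if (p > 0) {
      bits += p * Math.log2(1 / p);   // Pr(y) log(1 / Pr(y))
    }
  }
  return bits;
}

// Made-up example: eight outputs, all equally likely.
const uniform = { a: 1/8, b: 1/8, c: 1/8, d: 1/8, e: 1/8, f: 1/8, g: 1/8, h: 1/8 };
// Made-up example: the same eight outputs, but a few are much more likely.
const skewed = { a: 1/2, b: 1/4, c: 1/8, d: 1/16, e: 1/64, f: 1/64, g: 1/64, h: 1/64 };

console.log(entropyBits(uniform), entropyBits(skewed));
</code></pre>
    <ul>
      <li>The uniform example has entropy 3 bits and the skewed one 2 bits,
        so the skewed distribution leaves room for fewer comparisons on average</li>
    </ul>
  </li>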
  
  <li class="slide"><h1>Meeting the Entropy Lower Bounds</h1>
    <ul>

	<li>We give algorithms that use `O(n + H(Y))` comparisons, where `Y := f(I)` is the output</li>
	<ul>
	  <li>...and `text{Work} = O(text{Comparisons})`</li>
	  <li>That is, optimal</li>
	  <li>With a lot of storage: `Theta(n^2)`</li>
	</ul>
      <span  class="incremental">
	<li>A tradeoff: for given `epsilon in (0,1]`,</li>
	<ul>
	  <li>`1//epsilon` times the `O(n + H(Y))` comparisons</li>
	  <li>`n^{1+epsilon}` <em>training instances</em> : the `I_1...I_{k-1}`</li>
	  <li>`n^{1+epsilon} log n` space</li>
	</ul>
      </span>
    </ul>
  </li>
  
  <li class="slide"><h1>These Results Are <em>Not</em> About </h1>
    
    <ul>
      <li>Using any structure within a given instance</li>
      <ul>
	<li>Such as, the data is nearly sorted</li>
      </ul>
      <li>Assuming that the distribution of the instances is known</li>
      <li>Assuming that the distribution has any special properties</li>
      <ul>
	<li>(That is, beyond the independence condition described next)</li>
      </ul>
      <li>Using training instances as random samples</li>
      <ul>
	<li>(Well, it kind of is, but not quite)</li>
      </ul>
    </ul>  
  </li>      
      
      
      
  <li class="slide"> <h1>An Additional Condition: Independence</h1>
   <ul>
    
       <li>For input set `I = {x_1, x_2,... , x_n}`,<br/>
          we also require each `x_i` to be an independent random variable</li>
       <ul><li>This does <em>not</em> mean the `x_i` are identically distributed, or that the distributions
        are known</li></ul>
       <li>That is, `I` has a product distribution `D := prod_i bb D_i`</li>
    <span  class="incremental">
       <li>While other restrictions might also help,<br/>
          <em>some</em> additional condition on `D` is required for good bounds</li>
       <li>We show that for general distributions `D`,
	  exponential space is needed for target running time `O(n + H(Y))`</li>
    </span>
   </ul>
 </li>
  

  <li class="slide"><h1>Sorting : The Typical Set `V`</h1>
   <ul>
      <li>The sorting algorithm uses a set `V` of "typical" values, and a collection of search trees</li>
      <li>`V` is built as follows:</li>
       <ul>
        <li>Take `lambda` training instances `I_1...I_{lambda}`</li>
        <ul>
            <li>`lambda := c log n`, for value `c` to be determined</li>
        </ul>
        <li>Merge all values `I_1 cup I_2 cup ... cup I_{lambda}` to make a sorted list `J` with `lambda n` values</li>
        <li>Put every `lambda`-th value of `J` into the list `V`, which then has `n` values</li>
      </ul>
             <span  class="incremental">
       <li>`V` represents the overall distribution of `I`</li>
       <ul>
	<li>We expect one value of `I` in each interval `[v_j, v_{j+1})`</li>
      </ul>
       </span>
      <br/> <center><img src="f/V_list.jpg"/></center>
   </ul>
 </li>
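
  <li class="slide"><h1>Sorting : A Sketch of Building `V`</h1>
    <ul>
      <li>A minimal JavaScript sketch of the construction on the previous slide,
        assuming `lambda` is simply the number of training instances passed in
        (the slides take `lambda := c log n`)</li>
      <li>The name <code>buildTypicalSet</code> is illustrative, and a full sort stands in for the merge</li>
    </ul>
    <pre class="demo"><code>// Build the typical set V from lambda training instances of n values each:
// combine them into one sorted list J, then keep every lambda-th value.
function buildTypicalSet(trainingInstances) {
  const lambda = trainingInstances.length;
  const merged = [].concat(...trainingInstances);   // J, not yet sorted
  merged.sort((a, b) => a - b);                     // J, sorted
  const typical = [];
  for (let k = lambda - 1; merged.length > k; k += lambda) {
    typical.push(merged[k]);
  }
  return typical;
}

// Made-up example: lambda = 2 instances of n = 3 values each.
const exampleV = buildTypicalSet([[5, 1, 9], [2, 8, 4]]);
console.log(exampleV);
</code></pre>
    <ul>
      <li>Here `J` is 1, 2, 4, 5, 8, 9, and keeping every second value leaves `n = 3` values in `V`</li>
    </ul>
  </li>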
  
  <li class="slide"><h1>Sorting : Search Trees `T_i`</h1>
   <ul>
       <li>To use `V`,<br/>
          we build a binary search tree `T_i` on `V`<br/>
          for each input distribution `D_i`</li>
       <li>`T_i` is built so that its search cost for `x_i` is the optimal `H(D_i)`</li>
             <span  class="incremental">
       <li>More precisely:</li>
       <ul>
	<li>The random variable associated with the search is the bucket `b_i:= [v_j, v_{j+1})` containing `x_i`</li>
        <li>The expected search cost is `O(1 + H(b_i))`, and `H(b_i) le H(D_i)`</li>
	<li>Additional training instances are used to estimate `D_i` and build `T_i`</li>
      </ul>
       </span>
   </ul>
 </li>
  
  <li class="slide"><h1>Sorting : The Algorithm</h1>
   <ul>
       <li>The algorithm is:</li>
       <ul>
	 <li>For each `i=1..n`, locate `x_i` in the buckets using `T_i`</li>
                <span  class="incremental">
         <li>Sort the set of values falling in each bucket</li>
         <ul><li>`O(1)` values/bucket implies `O(1)` work/bucket</li></ul>
      </span>
       </ul>
        <span  class="incremental">        
       <li>Total expected work for the sorts in all buckets is `O(n)`, and the searches are entropy-optimal</li>
       <li>So we're done, right?</li>
       </span>
   </ul>
 </li>
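
  <li class="slide"><h1>Sorting : A Sketch of the Algorithm</h1>
    <ul>
      <li>A minimal JavaScript sketch of the two steps on the previous slide,
        assuming a plain binary search on `V` stands in for the trees `T_i`</li>
      <li>So this shows the bucketing structure only, not the entropy-optimal search cost;
        the names <code>locateBucket</code> and <code>bucketSort</code> are illustrative</li>
    </ul>
    <pre class="demo"><code>// Bucket index of x: the number of values of V that are at most x.
// With V = v_0, v_1, ..., bucket 0 lies below v_0 and bucket j starts at v_{j-1}.
function locateBucket(typical, x) {
  let lo = 0;
  let hi = typical.length;
  while (hi > lo) {
    const mid = Math.floor((lo + hi) / 2);
    if (x >= typical[mid]) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// Locate every value in its bucket, sort each bucket, and concatenate.
function bucketSort(typical, instance) {
  const buckets = [];
  for (let j = 0; typical.length >= j; j += 1) {
    buckets.push([]);
  }
  for (const x of instance) {
    buckets[locateBucket(typical, x)].push(x);
  }
  const sorted = [];
  for (const bucket of buckets) {
    bucket.sort((a, b) => a - b);
    sorted.push(...bucket);
  }
  return sorted;
}

// Made-up example values.
console.log(bucketSort([2, 5, 9], [7, 3, 10, 1, 6]));
</code></pre>
  </li>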
   
  <li class="slide"><h1>Analysis</h1>
    
    <ul>
      <li>We're not quite done</li>
      <li>Although the `T_i` are individually optimal,<br/>
        it hasn't been shown that their total cost is small</li>
      <li>It remains to show that `sum_i H(b_i) = O(n+H(Y))`</li>
      <ul><li>Independence implies that `H(b_1, b_2,...b_n) = sum_i H(b_i)`</li></ul>
      <li>That is, there is at least as much information in the output ranking `Y=f(I)`
        as in the bucket assignments, up to an additive `O(n)`</li>
    </ul>
  </li>
  
  <li class="slide"> <h1>Analysis via Encoding</h1>
    <ul>
      <li>Suppose `b := (b_1,...,b_n)` can be computed from the output ranking `Y`,<br/>
          using `O(n)` additional comparisons</li>
      <li>Then `b` can be encoded in `O(n + H(Y))` expected bits, so `H(b) = O(n + H(Y))`</li>
      <ul>
	<li>Encoding of `b` is: a good encoding of `Y`,
            plus bits representing comparison outcomes</li>
      </ul>
      <span  class="incremental">
      <li>`b` can be computed from such an encoding as desired:</li>
      <ul>
        <li>Sort the values `x_i` using `Y`;</li>
        <li>Merge that sorted list with `V`</li>
      </ul>
      </span>
    </ul>
  </li>
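
  <li class="slide"><h1>Analysis: A Sketch of the Merge</h1>
    <ul>
      <li>A minimal JavaScript sketch of the decoding step on the previous slide,
        assuming the values have already been put in sorted order using `Y`</li>
      <li>One merge pass against `V` recovers every bucket with a number of comparisons
        linear in the two list lengths; the name <code>bucketsByMerge</code> is illustrative</li>
    </ul>
    <pre class="demo"><code>// Given the sorted list V and the instance values in sorted order,
// return the bucket index of each value: the number of values of V at most x.
function bucketsByMerge(typical, sortedValues) {
  const bucketOf = [];
  let j = 0;   // values of V already passed; never moves backwards
  for (const x of sortedValues) {
    while (typical.length > j) {
      if (typical[j] > x) {
        break;
      }
      j += 1;
    }
    bucketOf.push(j);
  }
  return bucketOf;
}

// Made-up example values.
console.log(bucketsByMerge([2, 5, 9], [1, 3, 6, 7, 10]));
</code></pre>
    <ul>
      <li>Each comparison either advances in `V` or finishes one value, which is why the pass is linear</li>
    </ul>
  </li>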
  
  <li class="slide"><h1>Delaunay Triangulations</h1>
    <ul><div style="display:block; float:right;"><img src="f/delaunay.png"/></div>
      <li>Given a set `I` of points,<br/>
        its Delaunay triangulation is a planar subdivision whose vertices are the points in `I`</li>
      <li><b>If</b> a triangle `t` has:</li>
	<ul>
	  <li>Vertices from `I`, and </li>
	  <li>No points of `I` inside its circumscribed circle</li>
	</ul>
      <li><b>Then</b> `t` is a Delaunay triangle</li>
      <li>A Delaunay triangulation comprises all such Delaunay triangles</li>
      <ul><li>(Ignoring the unbounded parts)</li></ul>
    </ul>
  </li>
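
  <li class="slide"><h1>Sketch: Testing a Delaunay Triangle</h1>
    <ul>
      <li>A brute-force JavaScript check of the definition on the previous slide,
        using the standard in-circle determinant; it is not the self-improving algorithm</li>
      <li>Assumptions: points are <code>[x, y]</code> pairs, arithmetic is plain floating point with no robustness measures,
        and the names <code>orientation</code>, <code>inCircle</code>, and <code>isDelaunayTriangle</code> are illustrative</li>
    </ul>
    <pre class="demo"><code>// Positive when a, b, c make a counterclockwise turn, zero when collinear.
function orientation(a, b, c) {
  return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]);
}

// In-circle determinant: positive when d lies strictly inside the circle
// through a, b, c, provided a, b, c are in counterclockwise order.
function inCircle(a, b, c, d) {
  const ax = a[0] - d[0], ay = a[1] - d[1];
  const bx = b[0] - d[0], by = b[1] - d[1];
  const cx = c[0] - d[0], cy = c[1] - d[1];
  return (ax * ax + ay * ay) * (bx * cy - cx * by)
       - (bx * bx + by * by) * (ax * cy - cx * ay)
       + (cx * cx + cy * cy) * (ax * by - bx * ay);
}

// True when the triangle on points i, j, k has no other point of the set
// strictly inside its circumscribed circle.
function isDelaunayTriangle(points, i, j, k) {
  const a = points[i];
  let b = points[j];
  let c = points[k];
  const turn = orientation(a, b, c);
  if (turn === 0) {
    return false;                  // collinear vertices: no circumscribed circle
  }
  if (0 > turn) {
    const swap = b; b = c; c = swap;   // make the order counterclockwise
  }
  return points.every((p, index) =>
    index === i || index === j || index === k || 0 >= inCircle(a, b, c, p));
}

// Made-up example: the point [1, 1] lies inside the circle through the first three.
const samplePoints = [[0, 0], [4, 0], [0, 4], [1, 1]];
console.log(isDelaunayTriangle(samplePoints, 0, 1, 2), isDelaunayTriangle(samplePoints, 0, 1, 3));
</code></pre>
  </li>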
  
  <li class="slide"><h1>Sorting vs. Triangulation</h1>
    
    <ul>
      <li>Delaunay triangulation is like sorting, only more complicated</li>
      <ul><li>Actually, sorting can be reduced to Delaunay triangulation</li></ul>
      <li>That is:</li>
      <ul>
        <li>We can view sorting as:<br/>
          find all open intervals `(x_i, x_{i'})` that contain no values of `I`</li>
        <li>We can view finding the Delaunay triangulation as:<br/>
          find all disks with `{p_i, p_{i'}, p_{i''}}` on the boundary that contain no points of `I` in their interior</li>
        <ul><li>The <em>Delaunay disks</em></li></ul>
        </ul>
      <li>Our algorithm and analysis for triangulation generalize those for sorting</li>
    </ul>
  </li>
  
  <li class="slide"><h1>Triangulation: the Typical Set `V`</h1>
    
    <ul><img src="f/V_net.jpg" align="right" />
      <li>As for sorting, our algorithm for triangulation also builds and uses a "typical" set `V`</li>
      <li>`V` is a subset of `J := I_1 cup I_2 cup ... cup I_{lambda}`, `lambda = O(log n)`</li>
      <li>`V` is a range space `epsilon`-net of `J`, with `epsilon := 1//n`</li>
      <ul>
        <li>Such a net has the following property, for any disk `d`:<br/>
          <b>If</b> disk `d` contains no points of `V` <br/>
          <b>Then</b> `d` contains fewer than `epsilon lambda n = O(lambda)` points of `J`
        </li>
        <li>Such sets, of size `O(1//epsilon) = O(n)` exist [MRW90][CV07]</li>
        <li>Slightly larger random subsets are also `epsilon`-nets [HW97][C97]</li>
      </ul>
    </ul>
  </li>
    
    
  <li class="slide"> <h1>More About `V`</h1>
    
    <ul>
      <li>Any disk containing no points of `V` will contain an
        expected `O(1)` points of `I`</li>
      <li>We use `T(V)`, the Delaunay triangulation of `V`</li>
      <li>By construction, each Delaunay disk of `V` will contain expected `O(1)` points of `I`</li>
    </ul>
  </li>

  <li class="slide"><h1>Triangulation: Search Trees `T_i`</h1>
    
    <ul>
      <li>As for sorting, our algorithm for triangulation also builds and
        uses optimal search data structures `T_i`</li>
      <li>For sorting, `T_i` was a binary search tree</li>
      <li>For triangulation, `T_i` is a data structure for planar point location</li>
      <li>`T_i` allows fast search for the location of `p_i` in the triangulation of `V`</li>
      <li>The triangle `b_i` containing `p_i` is a random variable, since `p_i` is</li>
      <li>The fastest possible expected time to determine `b_i` is `H(b_i)`</li>
      <li>`T_i` is a data structure with such search time [AMMW07]</li>
    </ul>
  </li>
  
  <li class="slide"><h1>Triangulation: the Algorithm</h1>
    
    <ul>
      <li>For sorting, `T_i` is used to bucket `x_i`, and subsorts are done in each bucket</li>
      <li>For triangulation, `T_i` is used to find `b_i`, and that information is used to allocate `p_i`
      to `O(1)` subtriangulation subproblems</li>
      <ul><li>Each subproblem is built from points in three Delaunay disks of `V`</li></ul>
      <li>Each subtriangulation is on `O(1)` expected points</li>
      <li>The subtriangulations can be put together, to get a triangulation of `V cup I`</li>
      <li>For sorting, it is trivial to get the sorted version of `I` from the sorted list `V cup I`</li>
      <li>For triangulation, we apply a linear-time randomized algorithm to get `T(I)` from `T(V cup I)` [ChDHMST02]</li>
    </ul>
  </li>
  
  <li class="slide"><h1>Analysis: Encoding</h1>
    
    <ul>
      <li>As for sorting, even though the various steps are optimal in some respects,
      we're not done</li>
      <li>To show optimality, we need to show that the entropy of `b`
        is `O(n + H(T(I)))`</li>
      <li>As before: from `T(I)` we can obtain the `b_i` using an algorithm that needs
      `O(n)` comparisons, and this implies the result</li>
      <li>For sorting, a key step was merging the sorted lists `V` and `I`</li>
      <li>For triangulation, the analog is merging the triangulations `T(V)` and `T(I)`</li>
      <ul><li>Linear time using a polytope intersection algorithm [Ch92]</li></ul>
      <li>From `T(V cup I)`, we can obtain the `b_i` without too much pain</li>
    </ul>
  </li>
  

  
  <li class="slide"><h1>All the Analogies</h1>
    
    
<table style="font-size: 0.75em; text-align: left; width: 100%;" border="1"
 cellpadding="2" cellspacing="1">
  <tbody>
    <tr>
      <th style="font-weight: bold;">Sorting</th>
      <th style="font-weight: bold;">Delaunay Triangulation</th>
    </tr>
    <tr>
      <td>Intervals `(x_i, x_{i'})` containing no values of `I`</td>
      <td>Delaunay disks</td>
    </tr>
    <tr>
      <td>Typical set `V`</td>
      <td>Range space `epsilon`-net `V` [MRW90, CV07],<br/>Ranges are disks, `epsilon = 1//n`
      </td>
    </tr>
    <tr>
      <td>`log n` training instance points in each bucket</td>
      <td>`log n` training instance points in each disk</td>
    </tr>
    <tr>
      <td>Expect `O(1)` values of `I` in each bucket</td>
      <td>Expect `O(1)` points of `I` in each Delaunay disk of `V`</td>
    </tr>
    <tr>
      <td>Optimal weighted binary trees `T_i`</td>
      <td>Entropy-optimal planar point location data structures `T_i` [AMMW07]</td>
    </tr>
    <tr>
      <td>Sorting within buckets `->` sorted list of `V cup I`</td>
      <td>Triangulation within small regions `-> T(V cup I)`</td>
    </tr>
    <tr>
      <td>Removal of `V` from sorted `V cup I` (trivial)</td>
      <td>Construction of `T(I)` from `T(V cup I)` [ChDHMST02]</td>
    </tr>
    <tr>
      <td>In analysis: merge of sorted `V` and `I`</td>
      <td>In analysis: merge of `T(V)` and `T(I)` [Ch92]</td>
    </tr>
    <tr>
      <td>In analysis: recovery of buckets `b_i` from sorted `V cup I` (trivial)</td>
      <td>In analysis: recovery of triangles `b_i` in `T(V)` from `T(V cup I)`</td>
    </tr>
  </tbody>
</table>
</li>
  
   

    
  <li class="slide"><h1>Concluding Remarks</h1>
    
    <ul>
        <li>The results are pleasingly tight, but maybe a little too expensive</li>
        <li>Novelty for me: the coding arguments</li>
        <li>Are there stronger conditions that imply cheaper algorithms?</li>
        <li>Are there broader conditions that allow interesting results?</li>
        <ul><li>Without full independence of each `x_i`, for example</li></ul>
     </ul>
    <br/><center style="color:red">Thank you for your attention</center>
</li>





         
         
</ol>
</body>
</html>
