%Example of use of oxmathproblems latex class for problem sheets
%\documentclass{oxmathproblems}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% University of Oxford, Mathematical Institute LaTeX Problem Sheet class
% Created by K A Gillow, 12 September 2018
% Last updated 12 May 2020
%
% this latex package is designed for use with the standard exam.cls
% exam.cls is standard in texlive and miktex
% exam.cls also available at
% http://www.ctan.org/tex-archive/macros/latex/contrib/exam/
% http://math.mit.edu/~psh/#LaTeX
% exam.cls has numerous options, see the 130+ page doc examdoc.pdf
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\NeedsTeXFormat{LaTeX2e}
\ProvidesClass{oxmathproblems}[2020/05/12 v1.3 Oxford Maths problem sheet class]
\usepackage{amssymb}
% load in the main exam class with appropriate options
\LoadClass[12pt,a4paper]{exam}
%space out the lines by 25% more than the default (basically 1.5 line spacing)
\linespread{1.25}
% General thinking is that 12pt font, with 1.25 spread, black on white text,
% gives a good readable document for most people. For those that benefit
% from larger fonts many will be best zooming / expanding the document on
% screen or simply printing enlarged on A3 as that will best preserve the
% formatting rather than enlarging the font whilst still fitting into an A4
% sized space. For those that benefit from a different background colour
% this is likely easiest achieved if the document is on white and then
% software or a film is used to change the white to cream, blue etc
%load in some commonly needed packages
\RequirePackage{ifthen,amsmath,amssymb,amsthm,bm,graphicx,color}
%increase the printed page area width
\extrawidth{0.5cm}
%increase header space on title page only
\extraheadheight[1.5cm]{0cm}
%setup page headers/footers for first and subsequent pages
\pagestyle{headandfoot}
\lhead{}
\chead[\bfseries \Large \thecourse\\
\ifprintanswers \textit{\textcolor{red}{For Tutors Only --- Not For Distribution}}\\ \fi
\thesheettitle]{}
\cfoot{}
\rfoot{Page \thepage{} of \numpages}
\footrule
% define newcommands for user set page header details
\newcommand*{\oxfordterm}[1]{\def\theoxfordterm{#1}}
\newcommand*{\course}[1]{\def\thecourse{#1}}
\newcommand*{\sheetnumber}[1]{\def\thesheetnumber{#1}}
\newcommand*{\sheettitle}[1]{\def\thesheettitle{#1}}
\oxfordterm{}\course{}\sheetnumber{}\sheettitle{}
% define a command for user to set some contact info in the footer
\newcommand*{\contact}[1]{\def\thecontact{#1}}
\contact{}
% revised question command that tries to encourage page breaks
% to lie between questions rather than within questions
\newcommand{\miquestion}[1][]{\filbreak
\ifthenelse{\equal{#1}{}}{\question}{\question[#1]}
}
% do not put a box/frame around printed solutions
\unframedsolutions
\renewcommand{\subpartlabel}{(\thesubpart)}
%if you produce a problem sheet with solutions but want to strip the
%solutions from the tex file (e.g. to give to a student who needs to use a
%screen reader) you could run one of
%sed '/\\begin{solution}/,/\\end{solution}/d' file.tex > file-nosoln.tex
%perl -ne 'print unless /^\\begin\{solution\}/ .. /^\\end\{solution\}/' file.tex > file-nosoln.tex
%----------------------------------------------------------------------------
%(un)comment this line to enable/disable output of any solutions in the file
%\printanswers
%define the page header/title info
\course{Algorithm Design and Analysis (Fall 2023)}
\sheettitle{Assignment 1\\Deadline: Nov 1, 2023} %can leave out if no title per sheet
\begin{document}
\begin{questions}
\miquestion[25]
Prove the following generalization of the master theorem. Given constants $a\geq 1,b> 1,d\geq 0$, and $w\geq 0$, if $T(n)=1$ for $n<b$ and $T(n) = aT(n/b) + n^d\log^w n$, we have
$$
T(n) = \begin{cases}
O(n^d\log^w n) & \mbox{if }a < b^d \\
O(n^{\log_b a}) & \mbox{if }a > b^d \\
O(n^d\log^{w+1} n) & \mbox{if }a = b^d
\end{cases}.
$$
SOLUTION. Assume that $f(n) = n^d\log^w n$. The running time of solving all size-1 problem is
$$
a^{\log_b n}\cdot O(1) = \Theta(n^{\log_ba})
$$
The total running time for combining is g(n),
$$
g(n) = \sum\limits_{i=0}^{\log_bn} a^{i}\cdot c \cdot (\frac{n}{b^{i}})^d \cdot (log n - ilog b)^w
$$
$$
= c \cdot n^d \cdot (log^w n + \frac{a}{b^d}(log n - log b)^w + \dots + (\frac{a}{b^d})log_b a\cdot (log n - log_b n \cdot log b)^w)
$$
Therefore, the total time is\quad $\Theta(n^{\log_b a}) + g(n)$.
Assume that $h(k) = (\frac{a}{b^d})^k \cdot (log n - klog b)^w$, then
$$
h^{'}(k) = (log n - k log b)^{w-1} (ln {\frac{a}{b^d}}(log n - k log b) + w log b)
$$
h(k) has minimum at $k = log_b n$.
\\Case 1:
\begin{align*}
a < b^d &\Rightarrow \frac{a}{b^d} < 1 \\
& \Rightarrow n^{\log_b a} < n^d log^w n < { \lim_{\epsilon \to 0} n^{d + \epsilon}},
\end{align*}
The first dominates the g(n).
$$
T(n) = \Theta(n^{\log_b a}) + O(n^d log^w n)
= O(n^d log^w n)
$$
Case 2:
\begin{align*}
a > b^d &\Rightarrow \frac{a}{b^d} > 1\\
& \Rightarrow f(n) < n^{log_n a}\\
&\Rightarrow f(n) = O(n^{log_b a - \epsilon})
\end{align*}
The last dominates the g(n). Therefore,
$$
g(n) = { \lim_{\epsilon \to 0}\sum\limits_{i=0}^{\log_b n} a^{i}\cdot c\cdot (\frac{n}{b^{i}})^{log_b a - \epsilon}}
= c \cdot n^{log_b a - \epsilon} \cdot { \lim_{\epsilon \to 0}\sum\limits_{i=0}^{\log_b n} (\frac{a}{b^{log_b a - \epsilon}})^{i}}
$$
$$
= O(n^{log_b a - \epsilon} \cdot { \lim_{\epsilon \to 0 }(1 - \frac{(a - \epsilon)^2}{\epsilon})}) = O(n^{log_b a})
$$
$$
\Rightarrow T(n) = O(n^{\log_b a})
$$
Case 3:
$$a = b^d$$
$$f(n) = n^{log_b a}\cdot log^w n$$
$$g(n) = c \sum\limits_{i=0}^{\log_b n} a^{i}\cdot (\frac{n}{b^{i}})^{log_b a}\cdot log^w (\frac{n}{b^i})$$
$$= c \cdot n^{log_b a} \cdot \sum\limits_{i=0}^{\log_b n} log^w (\frac{n}{b^i})$$
When $n \longrightarrow +\infty $,\quad$(log n -log_b n \cdot log b)^w \longrightarrow log^w n + O(log^w n)$. Therefore,
$$g(n) = c\cdot n^{log_b a}\cdot log_b n \cdot (log^w n + O(log^w n))$$
$$= O(n^{log_b a}\cdot log_b n \cdot log^w n)$$
$$= O(n^{log_b a}\cdot log^{w + 1} n ) $$
$$\Rightarrow T(n) = O(n^{\log_b a}) + O(n^{log_b a}\cdot log^{w + 1} n ) = O(n^{log_b a}\cdot log^{w + 1} n )\quad \blacksquare$$
\miquestion[25]
Recall the median-of-the-medians algorithm we learned in the lecture. It groups the numbers by $5$. What happens if we group them by $3,7,9, \dots$? Please analyze those different choices and discuss which one is the best.
SOLUTION. Assume that we group the number by (2k + 1), where k = 0, 1, ……, n. Then we have $\frac{n}{2K+1}$ groups, so (2k + 1) medians. So the median of the medians v is no greater than $\frac{n}{2(2k + 1)}$ medians, no less than $\frac{n}{2(2k + 1)}$ medians. And each median is no greater than (k + 1) integers, no less than (k + 1) integers. Thus, v is no greater than $\frac{1}{2}\frac{k+1}{2k+1} n$ integers, no less than $\frac{1}{2}\frac{k+1}{2k+1} n$ integers. So the problem is reduced to at last $(1 - \frac{1}{2}\frac{k+1}{2k+1})n$. We should note that $(1 - \frac{1}{2}\frac{k+1}{2k+1})$ is becoming smaller asymptotic to 3/4 as k goes higher. So the less the k is, the smaller size problem will become for each recursion.
\\Now we calculate the running time T(n) of this algorithm. First we need to find the median of each group, which cost O(n). Then we find the true median of the $\frac {n}{2k+ 1}$ medians, which cost $T(\frac{n}{2k + 1})$. Finally we recurse the subsets with the worst case in which the problem is reduced to $(1 - \frac{1}{2}\frac{k+1}{2k+1})n$. So the recursion time is $T((1 - \frac{1}{2}\frac{k+1}{2k+1})n)$.
\\Therefore the time is:
$$T(n) = T(\frac{n}{2k + 1}) + T((1 - \frac{1}{2}\frac{k+1}{2k+1})n) + O(n)$$
Assume that $T(n) \leq A \cdot n$, $O(n) = C \cdot n$. By induction, base step: T(1) = 1. Inductive step: Suppose $T(i) \leq A \cdot i$ for each $i = 1,\dots ,n-1$. Then
$$T(n) = T(\frac{n}{2k + 1}) + T((1 - \frac{1}{2}\frac{k+1}{2k+1})n) + C \cdot n$$
$$\Rightarrow T(n) \leq A \cdot \frac{1}{2k + 1}n + A \cdot \frac{3k + 1}{2(2k + 1)}n + C \cdot n$$
$$\Rightarrow T(n) \leq A \cdot \frac{3k + 3}{2(2k + 1)}n + C \cdot n$$
If $A \cdot \frac{3k + 3}{2(2k + 1)}n + C \cdot n \leq A \cdot n$ \quad$\Rightarrow C \leq A \cdot \frac{k-1}{2(2k + 1)}$, the assumption is correct. It's obvious that it holds for every $k \textgreater 1$. i.e. every odd number more than 3. 3 is invalid. However, for each group, the time of finding the median is going up as k grows. i.e.C will goes up with k. To sum up, the minimum number 5 is the best choice.
\miquestion[25]
Let $X$ and $Y$ be two sets of integers.
Write $X\succ Y$ if $x\geq y$ for all $x\in X$ and $y\in Y$.
Given a set of $m$ integers, design an $O(m\log(m/n))$ time algorithm that partition these $m$ integers to $k$ groups $X_1,\ldots,X_k$ such that $X_i\succ X_j$ for any $i>j$ and $|X_1|,\ldots,|X_k|\leq n$.
Notice that $k$ is not specified as an input; you can decide the number of the groups in the partition, as long as the partition satisfies the given conditions.
You need to show that your algorithm runs in $O(m\log(m/n))$ time.
Remark: We have not formally define the asymptotic notation for multi-variable functions in the class.
For $f$ and $g$ be functions that maps $\mathbb{R}_{>0}^k$ to $\mathbb{R}_{>0}$, we say $f(\mathbf{x})=O(g(\mathbf{x}))$ if there exist constants $M,C>0$ such that $f(\mathbf{x})\leq C\cdot g(\mathbf{x})$ for all $\mathbf{x}$ with $x_i\geq M$ for some $i$.
The most rigorously running time should be written as $O(m\cdot\max\{\log(m/n),1\})$, although it is commonly just written as $O(m\log(m/n))$ for this kind of scenarios.
SOLUTION. The algorithm is shown as follows: First, randomly find a pivot v. There we always pick the first element as pivot. Then partition: put all elements greater than v to the left of v, and all smaller elements to the right of v. Partition is done recursively on each side of v, with the end sign that the size of sub question is no more than n. Now we get k groups $X_1,\ldots,X_k$ such that $X_i\succ X_j$ for any $i>j$ and $|X_1|,\ldots,|X_k|\leq n$.
\\The time complexity: If the pivot chosen at the each step divides the array into roughly equal halves, then the recursion depth is $log (\frac{m}{n})$. And we need to iterate all the element in each layer, costing O(m). Therefore, the total time is
$O(m\log(m/n))$.
\miquestion[25]
Given an array $A[1,\ldots,n]$ of integers sorted in ascending order, design an algorithm to \textbf{decide} if there exists an index $i$ such that $A[i]=i$ for each of the following scenarios. Your algorithm only needs to decide the existence of $i$; you do not need to find it if it exists.
\begin{parts}
\part The $n$ integers are positive and distinct.
\part The $n$ integers are distinct.
\part The $n$ integers are positive.
\part The $n$ integers are positive and are less than or equal to $n$.
\part No further information is known for the $n$ integers.
\end{parts}
Prove the correctness of your algorithms. For each part, try to design the algorithm with running time as low as possible.
\miquestion
How long does it take you to finish the assignment (including thinking and discussion)?
Give a score (1,2,3,4,5) to the difficulty.
Do you have any collaborators?
Please write down their names here.
SOLUTION. It takes me at least 10h to do the homework and I find it difficult for me with an approximate score of 4 partly because of the unacquaintance of Latex utilization. I also find out that I'm not familiar with the calculation of time complexity. I discussed the first and the third question with Huang YuTong.
\end{questions}
\end{document}