Data structure
Array
initialize
- miss element will be 0:
int myArray[10] = { 1, 2 }; // initialize to 1,2,0,0,0...
- initialize all element 0
int myArray[10] = { 0 }; // all elements 0
- In C++, an empty initialization list will also initialize every element to 0. This is not allowed with C:
int myArray[10] = {}; // all elements 0 in C++
It ok:
int myPoints[][3] = { { 1, 2, 3 }, { 4, 5, 6 }, { 7, 8, 9} };
Err: array type has incomplete element type ‘int[]’
int myPoints[3][] = { { 1, 2, 3 }, { 4, 5, 6 }, { 7, 8, 9} };
int myPoints[][] = { { 1, 2, 3 }, { 4, 5, 6 }, { 7, 8, 9} };
ref:
array initialize all element
Matrix
2D matrix
- rotate matrix:
- with extra space:
列出規則 -> tmp[row][col] = src[XXX][XXX] -> copy des from src - without extra space:
String
"Hello World"
字串來看 C 字串的組成:
- 結束字元: ‘\0’
- 字串長度: strlen 不包含’\0’
- 宣告動態變數存字串, 長度要多加1(結束字元)
ex:char string[] = "Hello World";
char *buf = (char*)malloc(sizeof(char)*(strlen(string)+1));
- traverse string
while (*string) {
…
}
ref:
Tree
被指向者(pointed)為child,指向者(point to)為parent。
- Root: 1 tree only 1 root, A->parent = NULL
ex: A - left node: no child
ex: J, G - internal node:至少有一個child的node,稱為internal node。
ex: E, B - level:定義root的level為1,其餘node的level=parent的level+1。
- height of node:某一node與其最長path上之descendant leaf node之間的edge數。
ex: D_height = 2, E_height = 1 - height of tree:樹的height即為root的height。 tree_height = 3
- depth: node 與root間的edge數
ex: D_depth = 1, G_depth = 2
ref:
Binary Tree
node只有2個child
- Full Binary Tree(Perfect Binary Tree):
條件:
- 所有internal node都有child滿的
- 所有leaf node的height相等
特性:
- left’s level = n, 總node數 2^n−1
- Complete Binary Tree: node按照Full Binary Tree的次序排列
- 其left child必定位在index(2i);
- 其right child必定位在index(2i+1);
- 其parent必定位在index(⌊i/2⌋)。
- Traverse: pre-, in-, post-是指parent node相對於child node的順序。
- Pre-order: parent先, 中->左->右, 4213657
- In-order: 按順序, 左->中->右, 1234567(得到排好的資料)
- Post-order: child先, 左->右->中, 1325764
- Level-order: 照著「level由小到大」的順序, breath-first traversal,由上而下, ABCDEFGHI
ref:
- In-order by parent: 多parent變數 **以後再讀**
Successor: 下一個
Predecessor: 上一個
- Insert Level Order: build Complete Binary tree(把洞補起來 )
insert node順序: 7->8->9->12
ref:
Binary Search Tree:
key relation:
left child < root
right child > root
- Insert: 比較key -> find position, handle parent
- Delete
handle 3 cases:
(1) no child: remove the node, if not root -> handle parent
(2) 1 child: copy the child to the remove node -> remove the child
(3) 2 child: find inorder successor(60) -> copy the successor to the remove node -> delete the successor, if not root -> handle parent
ref:
AVL tree
- 屬於BST
- 平衡樹, 每個node 左邊子樹高-右邊子樹高 = -1,0,1(左右高度差1)
- Insert
1. 照BST插入
2. 檢查是否符合高度:
Yes -> finish
NO -> 旋轉
4 cases:
1. LL -> R rotate
2. RR -> L rotate
3. LR -> L rotate -> R rotate
4. RL -> R rotate -> L rotate
ref:
- Delete
- Red Black Tree: ***還沒讀
多個color, 不一定balance
ref:
- Red Black Tree vs AVL Tree:
RBT: 用color(bool), memory cost高, search慢(不一定balanced), insert/remove快(fewer rotation)
AVL: 用height(int), memory cost低, search快(balanced), insert/remove慢(more rotation)
ref:
LinkList
Revert **再複習
7->4->3->NULL
3->4->7->NULL
指向”後一個”改成”前一個” preNode
4的位置只有7知道, 7指向改變前需要一個變數先存起來 nextNode
所以共需要3個變數 preNode, curNode, nextNode
revertList( Node **list)
preNode = null;
curNode = *list;
nextNode = (*list)->next;
while (nextNode != null)
{
curNode->next = preNode;
preNode = curNode;
curNode = nextNode;
nextNode= nextNode->next;
}
curNode->next = preNode;
*list = curNode;
Stack
符合Last in First out
push: 放入data
Pop: 拿出最上面data
Top: 回傳最上面data, 不影響儲存的data
Get size: 回傳有幾筆data, 不影響儲存的data
應用
Stack最主要的功能是「記得先前的資訊」,所以時常用來處理需要「回復到先前的狀態」的問題,也稱為Back-Tracking問題,例如:
- 編輯器(如word、sublime等等)中的undo。
- 網頁瀏覽器的「回到前一頁」功能。
- 編譯器(compiler)中的Parsing。
- 參考:Compiler Design — Top-Down Parser。
- 任何遞迴(recursion)形式的演算法,都可以用Stack改寫,例如Depth-First Search(DFS,深度優先搜尋)。
- 參考:GeeksforGeeks:Iterative Depth First Traversal of Graph。
- 因為遞迴(recursion)使用了系統的Call Stack。
- 於記憶體管理(memory management)中的Call Stack。
- 參考:Wikipedia:Call stack。
- 參考:Stack Overflow:What and where are the stack and heap?。
ref:
Queue
First in First out用於先到先執行、需要排程(scheduling)的應用:
- 演算法:Breadth-First Search(廣度優先搜尋)與Tree的Level-Order Traversal會用到Queue。
- 作業系統:被多個程式共享的資源(例如CPU、印表機、網站伺服器),一次只能執行一個需求(例如request、interrupt),因此需要有個Queue來安排多個程式的執行順序(例如device queue、job queue),請參考:
- Tutorialspoint:Operating System — Process Scheduling。
- Stack Overflow:What are practical applications of Queues?。
- Stack Exchange:Which queue does the long-term scheduler maintain?
ref:
- Double Link List:
優點:
1) A DLL can be traversed in both forward and backward direction.
2) The delete operation in DLL is more efficient if pointer to the node to be deleted is given.
3) We can quickly insert a new node before a given node.
In singly linked list, to delete a node, pointer to the previous node is needed. To get this previous node, sometimes the list is traversed. In DLL, we can get the previous node using previous pointer.
缺點:
1) Every node of DLL Require extra space for an previous pointer.
2) All operations require an extra pointer previous to be maintained.
插在後面,前面:handle preNode->next, nextNode->pre, curNode->pre, curNode->next
ref:
- Binary Heap:
- As Complete Binary Tree
Index: starting index = 0
parent((i-1)/2)
left child(2i+1)
right child(2i+2)
2. stored in a special order
a. Max Heap: parent > childrens
b. Min Heap: parent < childrens
ref:
- Hashing:
A technique or process of mapping keys, values into the hash table by using a hash function.
It is done for faster access to elements.
The efficiency of mapping depends on the efficiency of the hash function used.
Ex:
Storing employee records keyed using phone numbers:
操作:
1. Insert a phone number and corresponding information.
2. Search a phone number and fetch the information.
3. Delete a phone number and related information.
可以用下列data structure儲存:
1. Array of phone numbers and records.
2. Linked List of phone numbers and records.
3. Balanced binary search tree with phone numbers as keys.
4. Direct Access Table.
最快, 但是有幾個問題:
4.1. extra space required is huge:
ex: phone number is n digits, we need O(m * 10n) space for table where m is size of a pointer to record.
4.2. an integer in a programming language may not store n digits. ?? 不懂
- Hashing is a solution, get O(1) search time on average (under reasonable assumptions) and O(n) in worst case.
- Hash function: 把key map to hash table的index.
- Division method: 取餘數, table大小m要選質數, 避開2的指數
ex: m = 8 = 2³, 只會用最低的3個bit - Multiplication method: 用到更多bit
ref:
http://alrightchiu.github.io/SecondRound/hash-tableintrojian-jie.html
- Collision Handling:
因為把big key -> small index, 所以有可能2個以上key -> 相同index, 稱為collision.
2個方法:
1. chaining: 用link list, 需要額外空間
優點: 簡單
缺點: 浪費空間(hash table沒填滿, 還需要額外的link list)
2. open addressing: all elements are stored in the hash table itself
key 已經有人用 -> 往下找沒人用的key, 稱為probing.
- Probing method:
- Linear Probing:
缺點: 只要「起點」決定,Probing Sequence就決定(i不斷加一), 容易造成這個區塊越來越擠(primary clustering)
2. Quadratic Probing:
多2次項, h(k,i)=(h′(k)+c1i+c2i^2)mod m,c2≠0
case 1. c1=c2=0.5,m=2^P -> h(k,i)=(h′(k)+0.5i+0.5i^2)mod m
case 2. h(k,i)=(h′(k)−(−1)^i⌈i/2⌉^2)mod m,
where m be prime and m=4n+3,for some n
**下圖怪怪的??
3. Double Hashing: 多用一個hash function
h(k,i)=(h1(k)+ih2(k))mod m, h2(k)一定要與m互質
Chaining vs Open addressing:
當load factor α接近1時(table快滿), Open Addressing會變得沒效率, 建議load factor α<0.5
ref:
https://alrightchiu.github.io/SecondRound/hash-tableopen-addressing.html