Data structure

Array

xiru2ly3rd88
9 min readApr 19, 2022

initialize

  • miss element will be 0:
int myArray[10] = { 1, 2 }; // initialize to 1,2,0,0,0...
  • initialize all element 0
int myArray[10] = { 0 }; // all elements 0
  • In C++, an empty initialization list will also initialize every element to 0. This is not allowed with C:
int myArray[10] = {}; // all elements 0 in C++

It ok:
int myPoints[][3] = { { 1, 2, 3 }, { 4, 5, 6 }, { 7, 8, 9} };

Err: array type has incomplete element type ‘int[]’
int myPoints[3][] = { { 1, 2, 3 }, { 4, 5, 6 }, { 7, 8, 9} };
int myPoints[][] = { { 1, 2, 3 }, { 4, 5, 6 }, { 7, 8, 9} };

ref:
array initialize all element

Matrix

2D matrix

  • rotate matrix:
  1. with extra space:
    列出規則 -> tmp[row][col] = src[XXX][XXX] -> copy des from src
  2. without extra space:

String

"Hello World" 字串來看 C 字串的組成:

  • 結束字元: ‘\0’
  • 字串長度: strlen 不包含’\0’
  • 宣告動態變數存字串, 長度要多加1(結束字元)
    ex:
    char string[] = "Hello World";

char *buf = (char*)malloc(sizeof(char)*(strlen(string)+1));

  • traverse string

while (*string) {

}

ref:

Tree

parent vs child

被指向者(pointed)為child,指向者(point to)為parent。

  • Root: 1 tree only 1 root, A->parent = NULL
    ex: A
  • left node: no child
    ex: J, G
  • internal node:至少有一個child的node,稱為internal node。
    ex: E, B
  • level:定義root的level為1,其餘node的level=parent的level+1。
  • height of node:某一node與其最長path上之descendant leaf node之間的edge數。
    ex: D_height = 2, E_height = 1
  • height of tree:樹的height即為root的height。 tree_height = 3
  • depth: node 與root間的edge數
    ex: D_depth = 1, G_depth = 2

ref:

Binary Tree

node只有2個child

Binary Tree
  • Full Binary Tree(Perfect Binary Tree):

條件:

  1. 所有internal node都有child滿的
  2. 所有leaf node的height相等

特性:

  1. left’s level = n, 總node數 2^n−1
  • Complete Binary Tree: node按照Full Binary Tree的次序排列
  1. left child必定位在index(2i)
  2. right child必定位在index(2i+1)
  3. parent必定位在index(⌊i/2⌋)
Complete Binary Tree
not Complete Binary Tree
  • Traverse: pre-, in-, post-是指parent node相對於child node的順序。
  1. Pre-order: parent先, 中->左->右, 4213657
  2. In-order: 按順序, 左->中->右, 1234567(得到排好的資料)
  3. Post-order: child先, 左->右->中, 1325764
Level order traverse
  1. Level-order: 照著「level由小到大」的順序, breath-first traversal,由上而下, ABCDEFGHI

ref:

  1. In-order by parent: 多parent變數 **以後再讀**
    Successor: 下一個
    Predecessor: 上一個
  • Insert Level Order: build Complete Binary tree(把洞補起來 )
原本的tree
insert node順序

insert node順序: 7->8->9->12

ref:

Binary Search Tree:

key relation:
left child < root
right child > root

  • Insert: 比較key -> find position, handle parent
  • Delete

handle 3 cases:
(1) no child: remove the node, if not root -> handle parent

no child

(2) 1 child: copy the child to the remove node -> remove the child

1 child

(3) 2 child: find inorder successor(60) -> copy the successor to the remove node -> delete the successor, if not root -> handle parent

2 child

ref:

AVL tree

  1. 屬於BST
  2. 平衡樹, 每個node 左邊子樹高-右邊子樹高 = -1,0,1(左右高度差1)
  • Insert

1. 照BST插入
2. 檢查是否符合高度:
Yes -> finish
NO -> 旋轉

4 cases:
1. LL -> R rotate

LL

2. RR -> L rotate

RR

3. LR -> L rotate -> R rotate

LR

4. RL -> R rotate -> L rotate

RL

ref:

  • Delete
  • Red Black Tree: ***還沒讀

多個color, 不一定balance

ref:

  • Red Black Tree vs AVL Tree:

RBT: 用color(bool), memory cost高, search慢(不一定balanced), insert/remove快(fewer rotation)

AVL: 用height(int), memory cost低, search快(balanced), insert/remove慢(more rotation)

ref:

LinkList

Revert **再複習
7->4->3->NULL
3->4->7->NULL
指向”後一個”改成”前一個” preNode
4的位置只有7知道, 7指向改變前需要一個變數先存起來 nextNode
所以共需要3個變數 preNode, curNode, nextNode
revertList( Node **list)

preNode = null;
curNode = *list;
nextNode = (*list)->next;
while (nextNode != null)
{
curNode->next = preNode;
preNode = curNode;
curNode = nextNode;
nextNode= nextNode->next;
}
curNode->next = preNode;
*list = curNode;

ref:
http://alrightchiu.github.io/SecondRound/linked-list-xin-zeng-zi-liao-shan-chu-zi-liao-fan-zhuan.html

Stack

符合Last in First out

push: 放入data

Pop: 拿出最上面data

Top: 回傳最上面data, 不影響儲存的data

Get size: 回傳有幾筆data, 不影響儲存的data

應用

Stack最主要的功能是「記得先前的資訊」,所以時常用來處理需要「回復到先前的狀態」的問題,也稱為Back-Tracking問題,例如:

ref:

Queue

First in First out用於先到先執行、需要排程(scheduling)的應用:

ref:

  • Double Link List:

優點:
1) A DLL can be traversed in both forward and backward direction.
2) The delete operation in DLL is more efficient if pointer to the node to be deleted is given.
3) We can quickly insert a new node before a given node.
In singly linked list, to delete a node, pointer to the previous node is needed. To get this previous node, sometimes the list is traversed. In DLL, we can get the previous node using previous pointer.

缺點:
1) Every node of DLL Require extra space for an previous pointer.
2) All operations require an extra pointer previous to be maintained.

插在後面,前面:handle preNode->next, nextNode->pre, curNode->pre, curNode->next

插在中間

ref:

  • Binary Heap:
  1. As Complete Binary Tree
    Index: starting index = 0
    parent((i-1)/2)
    left child(2i+1)
    right child(2i+2)
Complete Binary Tree

2. stored in a special order
a. Max Heap: parent > childrens
b. Min Heap: parent < childrens

ref:

  • Hashing:
    A technique or process of mapping keys, values into the hash table by using a hash function.
    It is done for faster access to elements.
    The efficiency of mapping depends on the efficiency of the hash function used.

Ex:
Storing employee records keyed using phone numbers:
操作:
1. Insert a phone number and corresponding information.

2. Search a phone number and fetch the information.

3. Delete a phone number and related information.

可以用下列data structure儲存:
1. Array of phone numbers and records.

2. Linked List of phone numbers and records.

3. Balanced binary search tree with phone numbers as keys.

4. Direct Access Table.
最快, 但是有幾個問題:
4.1. extra space required is huge:
ex: phone number is n digits, we need O(m * 10n) space for table where m is size of a pointer to record.

4.2. an integer in a programming language may not store n digits. ?? 不懂

  • Hashing is a solution, get O(1) search time on average (under reasonable assumptions) and O(n) in worst case.
  • Hash function: 把key map to hash table的index.
  1. Division method: 取餘數, table大小m要選質數, 避開2的指數
    ex: m = 8 = 2³, 只會用最低的3個bit
  2. Multiplication method: 用到更多bit

ref:
http://alrightchiu.github.io/SecondRound/hash-tableintrojian-jie.html

  • Collision Handling:

因為把big key -> small index, 所以有可能2個以上key -> 相同index, 稱為collision.

2個方法:
1. chaining: 用link list, 需要額外空間
優點: 簡單
缺點: 浪費空間(hash table沒填滿, 還需要額外的link list)

2. open addressing: all elements are stored in the hash table itself
key 已經有人用 -> 往下找沒人用的key, 稱為probing.

chaining vs open addressing
  • Probing method:
  1. Linear Probing:
Linear Probing

缺點: 只要「起點」決定,Probing Sequence就決定(i不斷加一), 容易造成這個區塊越來越擠(primary clustering)

2. Quadratic Probing:
多2次項, h(k,i)=(h′(k)+c1i+c2i^2)mod m,c2≠0
case 1. c1=c2=0.5,m=2^P -> h(k,i)=(h′(k)+0.5i+0.5i^2)mod m

case 1

case 2. h(k,i)=(h′(k)−(−1)^i⌈i/2⌉^2)mod m,
where m be prime and m=4n+3,for some n
**下圖怪怪的??

case 2

3. Double Hashing: 多用一個hash function

h(k,i)=(h1(k)+ih2(k))mod m, h2(k)一定要與m互質

Chaining vs Open addressing:

搜尋的時間時間複雜度

當load factor α接近1時(table快滿), Open Addressing會變得沒效率, 建議load factor α<0.5

ref:
https://alrightchiu.github.io/SecondRound/hash-tableopen-addressing.html

--

--

xiru2ly3rd88
xiru2ly3rd88

Written by xiru2ly3rd88

0 Followers

學習筆記不保證100%正確, 只是用來快速複習; 聯絡信箱: xiru2ly3rd88@gmail.com

No responses yet