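Neo4j Internals (Data Storage)

Neo4j Low-Level Storage

Neo4j is an excellent graph database ╰( ̄▽ ̄)╭.

This article describes how Neo4j stores data on disk and the record formats it uses, including NodeRecordFormat, RelationshipRecordFormat, PropertyRecordFormat, and others.

The article has two parts:

  • Part 1 describes the storage formats, byte layout, and so on.
  • Part 2 works through sample data and, using the formats from Part 1, shows what the Neo4j store files actually contain.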

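Conventions

  • The analysis is based on Neo4j-community-3.4.10
  • Rel <=> Relationship
  • Prop <=> Property
  • N <=> no value (every bit of the field is set, e.g. FF FF FF FF)

Some background:

  • Data structures

    • Singly linked list
    • Doubly linked list
  • Storage capacity

    • 34B nodes
    • 34B relationships
    • 68B properties

These capacity limits are listed in the operations manual on the Neo4j website.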

Some numbers

2^{16} = 65,536
2^{32} = 4,294,967,296
2^{35} = 34,359,738,368
2^{36} = 68,719,476,736


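Part 1

This part describes the storage formats, byte layout, and so on.

Before going further, take a look at Neo4j's overall storage architecture:

[Figure: Neo4j overall storage architecture]

As the figure shows, the data structure underlying Neo4j's storage is the linked list.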

Ids

The number of bits that the Node, Relationship, Property, Label, and RelationshipType ids (and related fields) occupy on disk:

InUse: 1 bit;
NodeId: 35 bits;
LabelId: 32 bits;
RelId: 35 bits;
RelTypeId: 16 bits;
PropId: 36 bits;
PropKeyId: 24 bits;

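With these widths in mind, it is easy to see where the 34B nodes, 34B relationships, and 68B properties in the official documentation come from: 2^35 is about 34.4 billion and 2^36 is about 68.7 billion.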
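The arithmetic behind those limits can be checked directly. The sketch below takes the id widths from the list above and computes how many distinct ids each width allows; the names `ID_BITS` and `capacity` are illustrative only.

```python
# Id widths in bits, as listed in the Ids section above.
ID_BITS = {"node": 35, "relationship": 35, "property": 36}


def capacity(bits: int) -> int:
    """Number of distinct ids that fit in `bits` bits."""
    return 1 << bits


for name, bits in ID_BITS.items():
    print(f"{name}: 2^{bits} = {capacity(bits):,}")
```

The node and relationship widths give 34,359,738,368 ids each and the property width gives 68,719,476,736, which round to the 34B and 68B quoted in the manual.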

NodeRecord

NodeRecord(15 Bytes) :120 bits

4 bits: High PropId
3 bits: High RelId
1 bit : InUse
32 bits: Low RelId
32 bits: Low PropId
32 bits: Low Label // lsb
8 bits: High Label // msb
7 bits: extra // Unused
1 bit : isDense
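To make the layout concrete, here is a minimal Python sketch that decodes one 15-byte NodeRecord following the field order listed above. It assumes big-endian byte order, which is how the sample dumps in Part 2 read (00 00 00 04 is 4). The function names are hypothetical. The `inline_labels` helper goes beyond what this article states: it assumes the top 4 bits of the 40-bit label field hold the number of inlined labels and the remaining 36 bits are split evenly between the label ids, so treat it as an unverified assumption.

```python
import struct

NULL_LOW = 0xFFFFFFFF  # low 32 bits all set, with zero high bits, is written "N" in this article


def join_id(high: int, low: int):
    """Combine the high bits and the low 32 bits of an id; None stands for N."""
    if high == 0 and low == NULL_LOW:
        return None
    return (high << 32) | low


def parse_node_record(raw: bytes) -> dict:
    if len(raw) != 15:
        raise ValueError("a NodeRecord is 15 bytes")
    # 1 header byte, three 32-bit ints, 1 high-label byte, 1 extra byte.
    header, rel_low, prop_low, label_low, label_high, extra = struct.unpack(">BIIIBB", raw)
    return {
        "in_use": bool(header & 0x01),
        "next_rel": join_id((header >> 1) & 0x7, rel_low),
        "next_prop": join_id((header >> 4) & 0xF, prop_low),
        "label_field": (label_high << 32) | label_low,
        "is_dense": bool(extra & 0x01),
    }


def inline_labels(label_field: int) -> list:
    """ASSUMPTION (not from the article): top 4 bits = label count, low 36 bits = ids."""
    count = (label_field >> 36) & 0xF
    if count == 0 or count & 0x8:  # high bit assumed to mean the labels live elsewhere
        return []
    bits = 36 // count
    body = label_field & ((1 << 36) - 1)
    return [(body >> (i * bits)) & ((1 << bits) - 1) for i in range(count)]
```

Fed the bytes of sample node 1 from Part 2 (01 00 00 00 03 00 00 00 01 00 00 00 00 10 00), `parse_node_record` yields next_rel 3 and next_prop 1, and under the stated assumption `inline_labels` reads the 0x10 high byte as "one label, id 0".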

RelationshipRecord

RelationshipRecord(34 Bytes) :272 bits

 4 bits: High PropId
3 bits: High StartNodeId
1 bit : InUse
32 bits: Low StartNodeId
32 bits: Low EndNodeId
1 bit : // Unused
3 bits: High EndNodeId
3 bits: High StartNodePrevRelId
3 bits: High StartNodeNextRelId
3 bits: High EndNodePrevRelId
3 bits: High EndNodeNextRelId
16 bits: RelationshipType
32 bits: Low StartNodePrevRelId
32 bits: Low StartNodeNextRelId
32 bits: Low EndNodePrevRelId
32 bits: Low EndNodeNextRelId
32 bits: Low PropId
6 bits: // Unused
1 bit : firstInEndNodeChain
1 bit : firstInStartNodeChain
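The same approach works for the 34-byte RelationshipRecord. The sketch below is a hypothetical decoder that follows the field order above, again assuming big-endian byte order as seen in the Part 2 dumps. The 3-bit high parts packed next to the 16-bit relationship type are read as one 32-bit integer and then split with shifts.

```python
import struct


def join_id(high: int, low: int):
    """Combine high bits and low 32 bits; None stands for N (FF FF FF FF, no high bits)."""
    if high == 0 and low == 0xFFFFFFFF:
        return None
    return (high << 32) | low


def parse_relationship_record(raw: bytes) -> dict:
    if len(raw) != 34:
        raise ValueError("a RelationshipRecord is 34 bytes")
    # 1 header byte, eight 32-bit ints, 1 trailing flag byte.
    (header, start_low, end_low, type_int,
     sp_low, sn_low, ep_low, en_low, prop_low, extra) = struct.unpack(">B8IB", raw)
    return {
        "in_use": bool(header & 0x01),
        "start_node": join_id((header >> 1) & 0x7, start_low),
        "end_node": join_id((type_int >> 28) & 0x7, end_low),
        "type": type_int & 0xFFFF,
        "start_prev_rel": join_id((type_int >> 25) & 0x7, sp_low),
        "start_next_rel": join_id((type_int >> 22) & 0x7, sn_low),
        "end_prev_rel": join_id((type_int >> 19) & 0x7, ep_low),
        "end_next_rel": join_id((type_int >> 16) & 0x7, en_low),
        "next_prop": join_id((header >> 4) & 0xF, prop_low),
        "first_in_start_chain": bool(extra & 0x01),
        "first_in_end_chain": bool(extra & 0x02),
    }
```

The four prev/next fields are what make each relationship a member of two doubly linked lists at once, one per endpoint node.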

PropertyRecord

PropertyRecord(41 Bytes) :328 bits

  4 bits: High PrevPropId
4 bits: High NextPropId
32 bits: Low PrevPropId
32 bits: Low NextPropId
256 bits: PropBlock
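As a rough illustration of the PropertyRecord header, the sketch below splits a 41-byte record into its prev/next pointers and the 32 bytes of property blocks. It assumes big-endian byte order and leaves the blocks undecoded, since their internal layout is not described here; the function names are hypothetical.

```python
import struct


def join_id(high: int, low: int):
    """Combine high bits and low 32 bits; None stands for N."""
    if high == 0 and low == 0xFFFFFFFF:
        return None
    return (high << 32) | low


def parse_property_record(raw: bytes) -> dict:
    if len(raw) != 41:
        raise ValueError("a PropertyRecord is 41 bytes")
    # First byte: high 4 bits belong to prev, low 4 bits belong to next.
    modifiers, prev_low, next_low = struct.unpack(">BII", raw[:9])
    return {
        "prev_prop": join_id(modifiers >> 4, prev_low),
        "next_prop": join_id(modifiers & 0x0F, next_low),
        "blocks": raw[9:],  # 256 bits of property blocks, left as raw bytes
    }
```

The prev and next pointers are what chain the property records of one node or relationship into a doubly linked list.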

LabelRecord

LabelRecord(5 Bytes) :40 bits

7 bits: // Unused
1 bit : InUse
32 bits: Label

RelationshipTypeRecord

RelationshipTypeRecord(5 Bytes) :40 bits

 7 bits: // Unused
1 bit : InUse
16 bits: // Unused
16 bits: RelationshipType

PropertyKeyRecord

PropertyKeyRecord(9 Bytes) :72 bits

 7 bits: // Unused
1 bit : InUse
32 bits: PropCount
32 bits: NameId

DynamicRecord

Dynamic records are used in particular for storing property values, and the design is quite clever.

DynamicRecord(8+ Bytes) :64+ bits

 1 bit : 0: start record, 1: linked record
2 bits: // Unused
1 bit : InUse
4 bits: High NextBlockId
24 bits: nr of bytes in the data field in this record
32 bits: Low NextBlockId

n Bytes: data //data
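A small sketch may help show how the 8-byte DynamicRecord header is read. It assumes big-endian byte order and uses the "Person" record from the label names file in Part 2 as input; `parse_dynamic_record` is a hypothetical name.

```python
import struct


def parse_dynamic_record(raw: bytes) -> dict:
    """Decode the 8-byte header described above, then slice out the payload."""
    first, next_low = struct.unpack(">II", raw[:8])
    top = first >> 24             # first byte: linked flag, unused bits, inUse, high next bits
    length = first & 0xFFFFFF     # low 24 bits: number of data bytes in this record
    high_next = top & 0x0F
    if high_next == 0 and next_low == 0xFFFFFFFF:
        next_block = None         # N: this is the last record of the chain
    else:
        next_block = (high_next << 32) | next_low
    return {
        "is_start_record": not (top & 0x80),
        "in_use": bool(top & 0x10),
        "length": length,
        "next_block": next_block,
        "data": raw[8:8 + length],
    }


record = bytes.fromhex("10 00 00 06 FF FF FF FF 50 65 72 73 6F 6E") + bytes(24)
print(parse_dynamic_record(record)["data"])
```

A value longer than one record's data area would continue in the record that `next_block` points to, which is how a fixed record size can still hold variable-length values.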

RelationshipGroupRecord

One record holds first relationship links (out,in,loop) to relationships for one type for one entity.

RelationshipGroupRecord(25 Bytes) :200 bits

 1 bit : // Unused
3 bits: High FirstOutRelId
3 bits: High NextRelId
1 bit : InUse
1 bit : // Unused
3 bits: High FirstLoopRelId
3 bits: High firstInRelId
1 bit : // Unused
16 bits: type
32 bits: Low NextRelId
32 bits: Low FirstOutRelId
32 bits: Low FirstInRelId
32 bits: Low FirstLoopRelId
32 bits: Low OwningNodeId
5 bits: 00000 // Unused
3 bits: High OwningNodeId
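For completeness, the 25-byte RelationshipGroupRecord can be decoded the same way. This is only a sketch under the layout listed above and an assumed big-endian byte order; the article has no sample bytes for this record type, so the input in the usage note is synthetic.

```python
import struct


def join_id(high: int, low: int):
    """Combine high bits and low 32 bits; None stands for N."""
    if high == 0 and low == 0xFFFFFFFF:
        return None
    return (high << 32) | low


def parse_group_record(raw: bytes) -> dict:
    if len(raw) != 25:
        raise ValueError("a RelationshipGroupRecord is 25 bytes")
    (header, high, rel_type, next_low, out_low,
     in_low, loop_low, owner_low, owner_high) = struct.unpack(">BBH5IB", raw)
    return {
        "in_use": bool(header & 0x01),
        "type": rel_type,
        "next_group": join_id((header >> 1) & 0x7, next_low),
        "first_out": join_id((header >> 4) & 0x7, out_low),
        "first_in": join_id((high >> 1) & 0x7, in_low),
        "first_loop": join_id((high >> 4) & 0x7, loop_low),
        "owning_node": ((owner_high & 0x7) << 32) | owner_low,
    }
```

For example, a synthetic record built with `struct.pack(">BBH5IB", 0x01, 0x00, 3, 0xFFFFFFFF, 10, 0xFFFFFFFF, 0xFFFFFFFF, 546, 0)` decodes to type 3, first outgoing relationship 10, no incoming or loop chain, and owning node 546.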

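Part 2

This part works through sample data and, using the storage formats from Part 1, shows what the Neo4j store files actually contain.

Suppose we have the following relationship graph,

[Figure: sample relationship graph]

and we want to store it in Neo4j. (Sample data)

What exactly ends up in the Neo4j store files?


Data files

Sample data (extraction code: 9u4u)


Some classes

  • org.neo4j.kernel.impl.store.StoreType
    This class shows which store types Neo4j has, for example Node, Properties, Relationship, and so on.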

  • org.neo4j.kernel.impl.store.NeoStores

    This class contains the references to the “NodeStore,RelationshipStore,PropertyStore and RelationshipTypeStore”.


neostore

Metadata: creation time, store version, time of the last upgrade, and so on.

Related classes:

  • org.neo4j.kernel.impl.store.MetaDataStore
  • org.neo4j.kernel.impl.store.record.MetaDataRecord
  • org.neo4j.kernel.impl.store.format.standard.MetaDataRecordFormat
  • org.neo4j.kernel.impl.store.format.StoreVersion

Each record is 9 bytes: a 1-byte in-use flag followed by an 8-byte long. The fields are:

0: Creation time;
1: Random number for store id;
2: Current log version;
3: Last committed transaction;
4: Store format version;
5: First property record containing graph properties;
6: Last committed transaction containing constraint changes;
7: Transaction id most recent upgrade was performed at;
8: Time of last upgrade;
9: Checksum of last committed transaction;
10: Checksum of transaction id the most recent upgrade was performed at;
11: Log version where the last transaction commit entry has been written into;
12: Byte offset in the log file where the last transaction commit entry has been written into;
13: Commit time timestamp for last committed transaction;
14: Commit timestamp of transaction the most recent upgrade was performed at.

Sample data

No.:   data
0: 01 00 00 01 6A E7 BD 33 5F
// InUse: 1, time: 0x16AE7BD335F -> 1558666097503 ms -> 2019-05-24 10:48:17 (UTC+8);
1: 01 52 9E CB 97 A4 74 8F 5E
2: 01 00 00 00 00 00 00 00 00
3: 01 00 00 00 00 00 00 00 01
4: 01 00 39 2E 41 2E 30 76 06
// InUse: 1, store version: v0.A.9 (the low byte 06 is the length and the characters follow from the least significant byte upward, so the dump shows them reversed as "9.A.0v")
5: 01 FF FF FF FF FF FF FF FF
6: 01 00 00 00 00 00 00 00 00
7: 01 00 00 00 00 00 00 00 01
8: 01 00 00 01 6A E7 BD 33 5F
9: 01 00 00 00 00 00 00 00 00
10: 01 00 00 00 00 00 00 00 00
11: 01 00 00 00 00 00 00 00 00
12: 01 00 00 00 00 00 00 00 10
13: 01 00 00 00 00 00 00 00 00
14: 01 00 00 00 00 00 00 00 00
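The two annotated rows can be reproduced with a few lines of Python. This sketch assumes each neostore record is one in-use byte plus a big-endian signed long, as the dump suggests. The version decoding (low byte is the length, characters follow from the least significant byte upward) is my reading of the bytes in row 4 and should be treated as an assumption; the function names are hypothetical.

```python
import struct
from datetime import datetime, timezone


def parse_meta_record(raw: bytes):
    """Split a 9-byte neostore record into (in_use, value)."""
    in_use, value = struct.unpack(">Bq", raw)
    return bool(in_use), value


def version_long_to_string(value: int) -> str:
    """ASSUMPTION: low byte = length, then one character per byte, least significant first."""
    length = value & 0xFF
    return "".join(chr((value >> (8 * (i + 1))) & 0xFF) for i in range(length))


_, created_ms = parse_meta_record(bytes.fromhex("01 00 00 01 6A E7 BD 33 5F"))
print(datetime.fromtimestamp(created_ms // 1000, tz=timezone.utc))

_, version = parse_meta_record(bytes.fromhex("01 00 39 2E 41 2E 30 76 06"))
print(version_long_to_string(version))
```

The timestamp is printed in UTC, eight hours behind the local time quoted in the comment of row 0. Row 5 (all FF) decodes to -1, which matches the N convention.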

neostore.counts.db.a/b

Node counts, relationship counts, and so on.

Related classes:

  • org.neo4j.kernel.impl.store.counts.KeyFormat
  • org.neo4j.kernel.impl.store.counts.keys.CountsKeyType
  • org.neo4j.kernel.impl.store.counts.CountsTracker
  • org.neo4j.kernel.impl.store.counts.CountsUpdater

Node count, Relationship count, index…

The counts store is a key/value store.
Node count:

Key format:
0 1 2 3 4 5 6 7 8 9 A B C D E F
[t,0,0,0,0,0,0,0 ; 0,0,0,0,l,l,l,l]
t - entry type - "{@link #NODE_COUNT}"
l - label id

Value format:
0 1 2 3 4 5 6 7 8 9 A B C D E F
[0,0,0,0,0,0,0,0 ; c,c,c,c,c,c,c,c]
c - number of matching nodes

Relationship count:

Key format:
0 1 2 3 4 5 6 7 8 9 A B C D E F
[t,0,0,0,s,s,s,s ; r,r,r,r,e,e,e,e]
t - entry type - "{@link #RELATIONSHIP_COUNT}"
s - start label id
r - relationship type id
e - end label id

Value format:
0 1 2 3 4 5 6 7 8 9 A B C D E F
[0,0,0,0,0,0,0,0 ; c,c,c,c,c,c,c,c]
c - number of matching relationships
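The two key formats can be decoded with a short helper. The sketch below assumes big-endian integers and takes the entry type values (1 for a node count, 2 for a relationship count) from the sample data in this article; FF FF FF FF is returned as None, which in the counts store reads as "any". All names are illustrative.

```python
import struct

NODE_COUNT = 1           # entry type byte seen in the sample data
RELATIONSHIP_COUNT = 2   # entry type byte seen in the sample data


def _id(value: int):
    """FF FF FF FF (written N in this article) becomes None."""
    return None if value == 0xFFFFFFFF else value


def decode_key(key: bytes) -> tuple:
    if len(key) != 16:
        raise ValueError("a counts-store key is 16 bytes")
    entry_type = key[0]
    a, b, c = struct.unpack(">III", key[4:16])
    if entry_type == NODE_COUNT:
        return ("node", _id(c))                              # label id in the last 4 bytes
    if entry_type == RELATIONSHIP_COUNT:
        return ("relationship", _id(a), _id(b), _id(c))      # start label, type, end label
    raise ValueError(f"entry type {entry_type} is not covered by this sketch")


def decode_value(value: bytes) -> int:
    """The count sits in the last 8 of the 16 value bytes."""
    return struct.unpack(">q", value[8:16])[0]
```

For instance, the key 02 00 00 00 FF FF FF FF 00 00 00 04 00 00 00 01 decodes to ("relationship", None, 4, 1): relationships of type 4 from any start label to end label 1.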

Sample data

No.:   data
0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
1: 4E 65 6F 43 6F 75 6E 74 53 74 6F 72 65 00 02 56
// NeoCountStore02V

2: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
3: 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 01

4: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
// NodeCount, LabelId: 0;
5: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 06
// count: 6;

6: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// NodeCount, LabelId: 1;
7: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02
// count: 2;

8: 01 00 00 00 00 00 00 00 00 00 00 00 FF FF FF FF
// NodeCount, LabelId: N;
9: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08
// count: 8;

10: 02 00 00 00 00 00 00 00 00 00 00 00 FF FF FF FF
// RelationshipCount, StartLabelId: 0, RelationshipTypeId: 0, EndLabelId: N;
11: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

12: 02 00 00 00 00 00 00 00 00 00 00 01 FF FF FF FF
// RelationshipCount, StartLabelId: 0, RelationshipTypeId: 1, EndLabelId: N;
13: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

14: 02 00 00 00 00 00 00 00 00 00 00 02 FF FF FF FF
// RelationshipCount, StartLabelId: 0, RelationshipTypeId: 2, EndLabelId: N;
15: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

16: 02 00 00 00 00 00 00 00 00 00 00 03 FF FF FF FF
// RelationshipCount, StartLabelId: 0, RelationshipTypeId: 3, EndLabelId: N;
17: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

18: 02 00 00 00 00 00 00 00 00 00 00 04 FF FF FF FF
// RelationshipCount, StartLabelId: 0, RelationshipTypeId: 4, EndLabelId: N;
19: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02
// count: 2;

20: 02 00 00 00 00 00 00 00 00 00 00 05 FF FF FF FF
// RelationshipCount, StartLabelId: 0, RelationshipTypeId: 5, EndLabelId: N;
21: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

22: 02 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF
// RelationshipCount, StartLabelId: 0, RelationshipTypeId: N, EndLabelId: N;
23: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 07
// count: 7;

24: 02 00 00 00 FF FF FF FF 00 00 00 00 00 00 00 00
// RelationshipCount, StartLabelId: N, RelationshipTypeId: 0, EndLabelId: 0;
25: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

26: 02 00 00 00 FF FF FF FF 00 00 00 00 FF FF FF FF
// RelationshipCount, StartLabelId: N, RelationshipTypeId: 0, EndLabelId: N;
27: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

28: 02 00 00 00 FF FF FF FF 00 00 00 01 00 00 00 00
// RelationshipCount, StartLabelId: N, RelationshipTypeId: 1, EndLabelId: 0;
29: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

30: 02 00 00 00 FF FF FF FF 00 00 00 01 FF FF FF FF
// RelationshipCount, StartLabelId: N, RelationshipTypeId: 1, EndLabelId: N;
31: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

32: 02 00 00 00 FF FF FF FF 00 00 00 02 00 00 00 00
// RelationshipCount, StartLabelId: N, RelationshipTypeId: 2, EndLabelId: 0;
33: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

34: 02 00 00 00 FF FF FF FF 00 00 00 02 FF FF FF FF
// RelationshipCount, StartLabelId: N, RelationshipTypeId: 2, EndLabelId: N;
35: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

36: 02 00 00 00 FF FF FF FF 00 00 00 03 00 00 00 00
// RelationshipCount, StartLabelId: N, RelationshipTypeId: 3, EndLabelId: 0;
37: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

38: 02 00 00 00 FF FF FF FF 00 00 00 03 FF FF FF FF
// RelationshipCount, StartLabelId: N, RelationshipTypeId: 3, EndLabelId: N;
39: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

40: 02 00 00 00 FF FF FF FF 00 00 00 04 00 00 00 01
// RelationshipCount, StartLabelId: N, RelationshipTypeId: 4, EndLabelId: 1;
41: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02
// count: 2;

42: 02 00 00 00 FF FF FF FF 00 00 00 04 FF FF FF FF
// RelationshipCount, StartLabelId: N, RelationshipTypeId: 4, EndLabelId: N;
43: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02
// count: 2;

44: 02 00 00 00 FF FF FF FF 00 00 00 05 00 00 00 00
// RelationshipCount, StartLabelId: N, RelationshipTypeId: 5, EndLabelId: 0;
45: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

46: 02 00 00 00 FF FF FF FF 00 00 00 05 FF FF FF FF
// RelationshipCount, StartLabelId: N, RelationshipTypeId: 5, EndLabelId: N;
47: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
// count: 1;

48: 02 00 00 00 FF FF FF FF FF FF FF FF 00 00 00 00
// RelationshipCount, StartLabelId: N, RelationshipTypeId: N, EndLabelId: 0;
49: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 05
// count: 5;

50: 02 00 00 00 FF FF FF FF FF FF FF FF 00 00 00 01
// RelationshipCount, StartLabelId: N, RelationshipTypeId: N, EndLabelId: 1;
51: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02
// count: 2;

52: 02 00 00 00 FF FF FF FF FF FF FF FF FF FF FF FF
// RelationshipCount, StartLabelId: N, RelationshipTypeId: N, EndLabelId: N;
53: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 07
// count: 7;

54: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
55: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 19

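From the data above we can see that:
1. The data is stored as key/value pairs.
2. Node counts are kept per label.
3. Relationship counts are kept per combination of StartNodeLabel, RelationshipType, and EndNodeLabel.

So why record these numbers at all?
Given Neo4j's storage architecture, storing them is very much worthwhile.

For example, without stored counts, counting the nodes that carry a given label might go like this:

First, resolve the label to its labelId; then scan the node store, find every record whose labelId matches the given label and whose inUse flag is true, and count them.

As you can imagine, this is tedious and inefficient, and many other factors, such as deleted records, have to be handled along the way.

So storing the counts is a very neat solution.

The next question is: the data changes, so the counts keep changing. How are they updated?
This is where splitting the counts into two files, a and b, pays off: Neo4j alternates between the a and b files when writing, which keeps the counts accurate.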

These store files are immutable, and on store-flush the implementation swaps the read and write.


neostore.labeltokenstore.db

Stores the label ids.

Related classes:

  • org.neo4j.kernel.impl.store.format.standard.LabelTokenRecordFormat

Sample data

No.:   data
0: 01 00 00 00 01
// InUse:1, nameId:1;
1: 01 00 00 00 02
// InUse:1, nameId:2

neostore.labeltokenstore.db.names

Stores the label name that each label id maps to.

Related classes:

  • org.neo4j.kernel.impl.store.LabelTokenStore
  • org.neo4j.kernel.impl.store.TokenStore
  • org.neo4j.kernel.impl.store.record.LabelTokenRecord
  • org.neo4j.kernel.impl.store.record.DynamicRecord

DynamicRecord(8+ Bytes) :64+ bits

 1 bit : 0: start record, 1: linked record
2 bits: // Unused
1 bit : InUse
4 bits: High NextBlockId
24 bits: nr of bytes in the data field in this record
32 bits: Low NextBlockId

n Bytes: data //data

Sample data

No.:   data
0: 00 00 00 26 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
// The store header. Step: 0x26 = 38, so every record is a fixed 38 bytes; shorter data is zero-padded.

1: 10 00 00 06 FF FF FF FF 50 65 72 73 6F 6E 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
// (DynamicRecord): start record, inUse: 1, nr of bytes: 6, nextBlockId: N, data: Person;

2: 10 00 00 0C FF FF FF FF 4F 72 67 61 6E 69 7A 61 74 69 6F 6E 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
// (DynamicRecord): start record, inUse: 1, nr of bytes: 12, nextBlockId: N, data: Organization;
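Putting the two label files together, a label id can be resolved to its name in two hops: the token record gives a nameId, and the nameId selects a dynamic record in the names file. The sketch below works on the raw bytes of both files. It assumes big-endian integers, that the names-file header occupies one record slot and starts with the record size (as in row 0 above), that names fit in a single dynamic record, and that they are UTF-8 encoded; `token_name` is a hypothetical helper.

```python
import struct

TOKEN_RECORD_SIZE = 5  # 1 in-use byte + 4-byte nameId, per the LabelRecord layout


def token_name(token_store: bytes, names_store: bytes, token_id: int) -> str:
    """Resolve a token id to its name via the token store and the names store."""
    start = token_id * TOKEN_RECORD_SIZE
    in_use, name_id = struct.unpack(">BI", token_store[start:start + TOKEN_RECORD_SIZE])
    if not in_use & 0x01:
        raise KeyError(f"token {token_id} is not in use")
    # The first 4 bytes of the names-store header hold the record size (0x26 = 38).
    record_size = struct.unpack(">I", names_store[:4])[0]
    record = names_store[name_id * record_size:(name_id + 1) * record_size]
    length = struct.unpack(">I", record[:4])[0] & 0xFFFFFF  # low 24 bits of the header
    return record[8:8 + length].decode("utf-8")  # assumes a single, unchained record
```

With the sample bytes of both files, token 0 resolves to Person and token 1 to Organization.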

neostore.nodestore.db

Stores nodes.

Related classes:

  • org.neo4j.kernel.impl.store.format.standard.NodeRecordFormat

NodeRecord(15 Bytes) :120 bits

4 bits: High PropId
3 bits: High RelId
1 bit : InUse
32 bits: Low RelId
32 bits: Low PropId
32 bits: Low Label // lsb
8 bits: High Label // msb
7 bits: extra // Unused
1 bit : isDense

Sample data

No.:   data
0: 01 (00 00 00 04) (00 00 00 00) (00 00 00 00 10) 00
// InUse: true, relationship: 4, prop:0, label:0;

1: 01 (00 00 00 03) (00 00 00 01) (00 00 00 00 10) 00
// InUse: true, relationship: 3, prop:1, label:0;

2: 01 00 00 00 05 00 00 00 02 00 00 00 00 10 00
3: 01 00 00 00 06 00 00 00 03 00 00 00 00 10 00
4: 01 00 00 00 06 00 00 00 04 00 00 00 00 10 00
5: 01 FF FF FF FF 00 00 00 05 00 00 00 00 10 00
...: ...
546: 01 00 00 00 05 00 00 00 C7 00 00 00 01 10 00
// InUse: true, relationship: 5, prop:199, label:1;

547: 01 FF FF FF FF FF FF FF FF 00 00 00 01 10 00
// InUse: true, relationship: N, prop:N, label:1;

neostore.propertystore.db

Stores properties.

PropertyRecord is a container for PropertyBlocks. PropertyRecords form
a double linked list and each one holds one or more PropertyBlocks that
are the actual property key/value pairs. Because PropertyBlocks are of
variable length, a full PropertyRecord can be holding just one
PropertyBlock.

Related classes:

  • org.neo4j.kernel.impl.store.format.standard.PropertyRecordFormat
  • org.neo4j.kernel.impl.store.record.PropertyRecord
  • org.neo4j.kernel.impl.store.PropertyStore

PropertyRecord(41 Bytes) :328 bits

  4 bits: High PrevPropId
4 bits: High NextPropId
32 bits: Low PrevPropId
32 bits: Low NextPropId
256 bits: PropBlock

Sample data

No.:   data
0: 00 (FF FF FF FF) (FF FF FF FF) B3 AD 26 0A 7B 00 00 00 00 00 00 00 00 00 00 14 00 00 00 01 55 00 00 01 00 00 00 00 00 00 00 00
// prev: N, next: N;

1: 00 FF FF FF FF FF FF FF FF DD 2D 01 8C 7B 00 00 00 00 00 00 00 00 00 06 42 00 00 00 01 25 00 00 01 00 00 00 00 00 00 00 00
2: 00 FF FF FF FF FF FF FF FF DD 0D 0D 0E 7B 00 00 00 00 00 00 00 00 01 74 29 00 00 00 01 55 00 00 01 00 00 00 00 00 00 00 00
3: 00 FF FF FF FF FF FF FF FF CF 74 2B 8C 7B 00 00 00 00 00 00 00 00 00 06 AB 00 00 00 01 65 00 00 01 00 00 00 00 00 00 00 00
4: 00 FF FF FF FF FF FF FF FF 5F 0D 0D 0E 7B 00 00 00 00 00 00 00 00 01 AD 26 00 00 00 01 75 00 00 01 00 00 00 00 00 00 00 00
5: 00 FF FF FF FF FF FF FF FF A3 76 A9 8A 7B 00 00 00 00 00 00 00 00 00 00 14 00 00 00 02 65 00 00 01 00 00 00 00 00 00 00 00
...:...
199: 00 FF FF FF FF FF FF FF FF 52 9D 26 1E 7B 00 00 00 CE 94 E7 94 BB 53 72 A0 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00
// file offset 0x1FDF = 199 * 41; prev: N, next: N;
...:...
398: 00 FF FF FF FF FF FF FF FF 00 90 01 10 1B 00 00 02 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
// file offset 0x3FBE = 398 * 41; prev: N, next: N;
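The offsets in the comments above follow from simple multiplication: with fixed-size records, record n starts at byte n times the record size. A minimal sketch, using the record sizes from Part 1 and hypothetical helper names:

```python
# Record sizes in bytes, taken from the layouts in Part 1.
RECORD_SIZE = {"node": 15, "relationship": 34, "property": 41}


def record_offset(store: str, record_id: int) -> int:
    """Byte offset of a record in a store whose records start at offset 0."""
    return record_id * RECORD_SIZE[store]


def read_record(path: str, store: str, record_id: int) -> bytes:
    """Seek straight to one record in a store file and return its raw bytes."""
    with open(path, "rb") as f:
        f.seek(record_offset(store, record_id))
        return f.read(RECORD_SIZE[store])


print(hex(record_offset("property", 199)))
print(hex(record_offset("property", 398)))
```

This is why a lookup by id needs no index: one multiplication and one seek locate the record.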

neostore.propertystore.db.index

Stores the property key ids.

Related classes:

  • org.neo4j.kernel.impl.store.format.standard.PropertyKeyTokenRecordFormat
  • org.neo4j.kernel.impl.store.format.standard.TokenRecordFormat
  • org.neo4j.kernel.impl.store.record.PropertyKeyTokenRecord

Sample data

No.:   data
0: 01 00 00 00 00 00 00 00 01
1: 01 00 00 00 00 00 00 00 02
2: 01 00 00 00 00 00 00 00 03

neostore.propertystore.db.index.keys

Stores the property key names.

Related classes:

  • org.neo4j.kernel.impl.store.format.standard.DynamicRecordFormat

Sample data

No.:   data
0: 00 00 00 26 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
// The store header. Step: 0x26 = 38, so every record is a fixed 38 bytes; shorter data is zero-padded.

1: 10 00 00 04 FF FF FF FF 6E 61 6D 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
// (DynamicRecord): start record, inUse: 1, nr of bytes: 4, nextBlockId: N, data: name;

2: 10 00 00 03 FF FF FF FF 61 67 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
// (DynamicRecord): start record, inUse: 1, nr of bytes: 3, nextBlockId: N, data: age;

3: 10 00 00 04 FF FF FF FF 64 61 74 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
// (DynamicRecord): start record, inUse: 1, nr of bytes: 4, nextBlockId: N, data: date;


neostore.relationshipstore.db

Stores relationship data.

Related classes:

  • org.neo4j.kernel.impl.store.format.standard.RelationshipRecordFormat

RelationshipRecord(34 Bytes) :272 bits

 4 bits: High PropId
3 bits: High StartNodeId
1 bit : InUse
32 bits: Low StartNodeId
32 bits: Low EndNodeId
1 bit : // Unused
3 bits: High EndNodeId
3 bits: High StartNodePrevRelId
3 bits: High StartNodeNextRelId
3 bits: High EndNodePrevRelId
3 bits: High EndNodeNextRelId
16 bits: RelationshipType
32 bits: Low StartNodePrevRelId
32 bits: Low StartNodeNextRelId
32 bits: Low EndNodePrevRelId
32 bits: Low EndNodeNextRelId
32 bits: Low PropId
6 bits: // Unused
1 bit : firstInEndNodeChain
1 bit : firstInStartNodeChain

Sample data

No.:   data
0: 01 (00 00 00 01) (00 00 00 00) 00 00 (00 00) (00 00 00 01) (FF FF FF FF) (00 00 00 01) (FF FF FF FF) (00 00 01 8E) 00
// inUse: 1
// startNodeId: 1
// endNodeId: 0
// relationshipType: 0x0000
// startNodePrevRelId: 1
// startNodeNextRelId: N
// endNodePrevRelId: 1
// endNodeNextRelId: N
// PropId: 398

1: 01 00 00 00 00 00 00 00 01 00 00 00 01 00 00 00 02 00 00 00 00 00 00 00 03 00 00 00 00 FF FF FF FF 00
2: 01 00 00 00 02 00 00 00 00 00 00 00 02 00 00 00 03 FF FF FF FF 00 00 00 04 00 00 00 01 FF FF FF FF 00
3: 01 00 00 00 01 00 00 00 02 00 00 00 03 00 00 00 03 00 00 00 01 00 00 00 05 00 00 00 02 FF FF FF FF 01
4: 01 00 00 00 00 00 00 02 22 00 00 00 04 00 00 00 04 00 00 00 02 00 00 00 05 FF FF FF FF FF FF FF FF 01
5: 01 00 00 00 02 00 00 02 22 00 00 00 04 00 00 00 03 00 00 00 03 00 00 00 02 00 00 00 04 FF FF FF FF 03
6: 01 00 00 00 03 00 00 00 04 00 00 00 05 00 00 00 01 FF FF FF FF 00 00 00 01 FF FF FF FF FF FF FF FF 03


neostore.relationshiptypestore.db

Stores the RelationshipType ids.

Related classes:

  • org.neo4j.kernel.impl.store.format.standard.RelationshipTypeTokenRecordFormat

Sample data

No.:   data
0: 01 00 00 00 01
// inUse: 1, nameId: 1 (record 0 is relationship type id 0; its name is record 1 of the .names file);

1: 01 00 00 00 02
2: 01 00 00 00 03
3: 01 00 00 00 04
4: 01 00 00 00 05
5: 01 00 00 00 06

neostore.relationshiptypestore.db.names

Stores the RelationshipType names.

Related classes:

  • org.neo4j.kernel.impl.store.format.standard.DynamicRecordFormat

Sample data

No.:   data
0: 00 00 00 26 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
// The store header. Step: 0x26 = 38, so every record is a fixed 38 bytes; shorter data is zero-padded.

1: 10 00 00 04 FF FF FF FF 77 69 66 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
// (DynamicRecord): start record, inUse: 1, nr of bytes: 4, nextBlockId: N, data: wife;

2: 10 00 00 09 FF FF FF FF 63 6F 6C 6C 65 61 67 75 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
// (DynamicRecord): start record, inUse: 1, nr of bytes: 9, nextBlockId: N, data: colleague;

3: 10 00 00 09 FF FF FF FF 63 6C 61 73 73 6D 61 74 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
4: 10 00 00 07 FF FF FF FF 62 72 6F 74 68 65 72 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
5: 10 00 00 05 FF FF FF FF 65 6E 74 65 72 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
6: 10 00 00 06 FF FF FF FF 66 72 69 65 6E 64 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00


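Conclusion

With fixed-length records and numeric ids, a query can locate a record trivially from its offset. It is a very clever design.

Dynamic records are used in particular for storing property values. At first glance they do not look as easy to locate as fixed-length records, but they follow clear rules too, and the design is just as clever.

Neo4j's low-level storage does not rely on very complex data structures; the one used most often is the doubly linked list.
Starting from the id of either a node or a relationship, you can reach the related nodes, relationships, and properties. The design is easy to follow yet tightly interlinked, much in the spirit of "one begets two, two begets three, three begets all things".

😃 What you learn from the "net" always feels shallow; to truly understand something you must do it yourself! I hope this article helps readers who are interested in Neo4j's internals.

In short, Neo4j is an excellent graph database. To quote the official documentation: Neo4j is limited only by your imagination!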

Neo4j’s application is only limited by your imagination.

