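master.info file keeps getting truncated
Problem description:
I have a master-to-master replication setup on two different servers running MySQL Ver 14.14, Distrib 5.1.16 on Red Hat Linux. When I reboot one of the servers, the slave fails to start. When I list the /var/lib/mysql directory, I see that the master.info file has been truncated to zero length, which makes MySQL think replication was never set up.
Here is the my.cnf for server 1: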
[client]
port = 3306
socket = /var/lib/mysql/mysql.sock
[mysqldump]
quick
max_allowed_packet = 16M
[mysql]
no-auto-rehash
[isamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M
[myisamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M
[mysqlhotcopy]
interactive-timeout
[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
key_buffer = 16M
max_allowed_packet = 1M
table_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
myisam_sort_buffer_size = 8M
log-bin = mysql-bin
relay-log = mysqld-relay-bin
relay-log-index = mysqld-relay-bin.index
server-id = 101
binlog-format = STATEMENT
replicate-do-db = foo
replicate-do-db = bar
binlog-do-db = foo
binlog-do-db = bar
auto_increment_increment = 2
auto_increment_offset = 1
master-connect-retry = 2
sync_binlog = 1
log-error = mysqld.log
log-warnings = 2
wait_timeout = 31536000
expire_logs_days = 45
Here is the my.cnf for server 2:
[client]
port = 3306
socket = /var/lib/mysql/mysql.sock
[mysqldump]
quick
max_allowed_packet = 16M
[mysql]
no-auto-rehash
[isamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M
[myisamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M
[mysqlhotcopy]
interactive-timeout
[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
key_buffer = 16M
max_allowed_packet = 1M
table_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
myisam_sort_buffer_size = 8M
log-bin = mysql-bin
relay-log = mysqld-relay-bin
relay-log-index = mysqld-relay-bin.index
server-id = 102
binlog-format = STATEMENT
replicate-do-db = foo
replicate-do-db = bar
binlog-do-db = foo
binlog-do-db = bar
auto_increment_increment = 2
auto_increment_offset = 2
master-connect-retry = 2
sync_binlog = 1
log-error = mysqld.log
log-warnings = 2
wait_timeout = 31536000
expire_logs_days = 45
I set up the slave on each server like this:
STOP SLAVE ; RESET SLAVE ; CHANGE MASTER TO MASTER_HOST='other_sys', MASTER_USER='repl', MASTER_PASSWORD='super_secret_password', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=14048 ;
The MASTER_LOG_FILE and MASTER_LOG_POS values in the example above are arbitrary. After doing this, I get the following MySQL slave status:
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: other_sys
Master_User: repl
Master_Port: 3306
Connect_Retry: 2
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 14048
Relay_Log_File: mysqld-relay-bin.000002
Relay_Log_Pos: 251
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB: foo,bar
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 14048
Relay_Log_Space: 407
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.00 sec)
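I then restart the server using the reboot command. After the restart, the MySQL slave sometimes starts automatically from where it left off. Other times the MySQL slave does not start at all: show slave status \G returns an empty set, and the /var/lib/mysql/master.info file has been truncated to zero length. It is as if Linux did not flush the file buffers to disk on reboot, so the slave information was never saved.
Am I missing something in how I configure the slave?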
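As a rough illustration of the symptom, a small shell check can tell an empty master.info apart from a missing or populated one after a reboot. This is only a sketch: the function name check_master_info is hypothetical, and the default path is the one mentioned in this question.

```shell
# Hypothetical helper: classify the replication info file as
# "missing", "truncated" (exists but zero length), or "ok".
check_master_info() {
    path=${1:-/var/lib/mysql/master.info}   # default path taken from this question
    if [ ! -e "$path" ]; then
        echo "missing"
    elif [ ! -s "$path" ]; then             # -s is true only for non-empty files
        echo "truncated"
    else
        echo "ok"
    fi
}
```

Running check_master_info with no argument right after the machine comes back up would report "truncated" in the failure case described here.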
Answer
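In case anyone is wondering how I solved this:
The problem was Linux file buffering. I performed the same steps as above, except that I ran sync before running reboot, and it worked in 10 out of 10 attempts. Without sync, it failed 9 times out of 10 in my tests. I don't know why Linux is not syncing the file cache at shutdown, but running sync, or waiting a minute before issuing reboot, solved the problem.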
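The workaround can be sketched as a small wrapper that flushes buffered file data before running the restart command. The function name flush_then_run is hypothetical; sync and reboot are the standard commands used above, and this only reproduces the workaround without explaining why the shutdown path skipped the flush.

```shell
# Hypothetical wrapper: flush buffered file data to disk, then run
# whatever command is passed in (for example, reboot).
flush_then_run() {
    sync || return 1   # ask the kernel to write dirty buffers to disk
    "$@"               # run the given command only if sync succeeded
}
```

Run as root, flush_then_run reboot would correspond to the sequence that worked in my tests.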
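This is bizarre. Unix shutdown scripts were already doing "sync" three or four times 35 years ago. – EJP
It turns out this is a bigger problem on our production systems: it can happen to any and all files on our Linux systems. I posted that question on this site, but it was moved to unix.stackexchange.com. Maybe someone there can tell me whether our shop has some configuration that breaks this. –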