×

Out of memory: Killed process XXX,一次Mysql服务不断重启排查,原因竟然是它 服务器报错Out of memory: Kill process (mysqld)内存

hqy hqy 发表于2024-11-29 17:15:34 浏览48 评论0

抢沙发发表评论

Out of memory: Killed process XXX



某用户1G内存的小鸡Linux系统(centos7)总是运行几小时就陷入假死状态,同时伴随CPU高负载现象。远程登录SSH无反应,宝塔面板也打不开。


本地调出窗口,发现提示:
Out of memory: Killed process 5372 (mysqld) score 130 or sacrifice child
Killed process 5372 (mysqld), UID 0, total-vm:2539052kB, anon-rss:2117096kB, file-rss:0kB, shmem-rss:0kB


这表明物理内存严重不足,导致占用内存最大的MYSQL被KILL了。其它程序也因为内存不足陷入了假死状态。


这种情况下,如果没办法减少内存占用,明智的办法只能升级内存了。


因为Out of memory 状态下,linux有一个保护机制:OOM(Out of memory) Killer,用于避免在[内存]不足的时候不至于出现严重问题,把一些无关的高占用的进程优先杀掉,即在内存严重不足时,系统为了继续运转,内核会挑选一个进程,将其杀掉,以释放内存,缓解内存不足情况,









前段时间,有测试人员联系我,需求帮助,让我帮忙看看,测试环境的一台mysql数据库不断的重启,导致他们的测试无法进行,时间拖的长了,会影响上线进度。


问题描述

在一个测试环境,mysql5.7部署在centos7.4的系统上,测试人员用LoadRunner进行应用的一个压测测试,压测并发为128,启动LoadRunner过几秒,LoadRunner日志就报错,显示mysql服务断开了,过几秒又恢复,然后又断开,异常诡异。


排查定位

于是就登录到服务器,看一下数据库服务器的内存,内存已经用完了,紧接着就看centos的系统日志(/var/log/messages),在这个日志里发现mysql服务因为OOM,导致进程被系统给kill掉了,然后因为mysql服务有守护进程存在,又被自动启动了。


一个经验丰富的mysql数据库运维人员,应该很快就知道什么原因导致OOM了。第一:mysql数据库的buffer pool内存参数配置不合理第二:mysql数据库的session初始化内存参数配置不合理第三:数据库连接总数配置不合理


测试环境的数据库服务器内存为4G,在这里就不谈,为什么测试人员要在这个服务器上做压测,其实做应用压测,应该要用准用的压测环境,而不是随便找个环境就压测,不展开说了。


模拟复盘

测试环境mysql5.7,centos7.7,内存2G

为了更好的模拟复盘上述故障,这里关闭掉模拟环境的swap,关闭步骤如下所示


    [root@localhost data]# free -m
                 total        used        free      shared  buff/cache   available
    Mem:           2124         565         640           9         917        1398
    Swap:          2047           0        2047

    [root@localhost data]# swapon -s
    文件名                          类型            大小    已用    权限
    /dev/dm-1                               partition       2097148 264     -2

    [root@localhost data]# swapoff dev/dm-1
    [root@localhost data]# free -m
                 total        used        free      shared  buff/cache   available
    Mem:           2124         254         944           9         925        1709
    Swap:             0           0           0


    从上面free -m结果中,可以看到swap已经变成0了。

    释放cache占用的内存


      [root@localhost data]# sync
      [root@localhost data]# echo 1 > proc/sys/vm/drop_caches ;  
      [root@localhost data]# free -m
                   total        used        free      shared  buff/cache   available
      Mem:           2124         254        1784           9          84        1748
      Swap:             0           0           0


      可以看到目前剩余的内存为1784M,mysql数据库占用内存主要有2大块

      第一:buffer pool占用,

      第二:初始化连接占用的内存


      在这里设置mysql的buffer pool为1500M,会话的参数设置如下

        read_buffer_size = 32M
        read_rnd_buffer_size = 32M

        sort_buffer_size = 32M
        tmp_table_size = 32M
        max_heap_table_size=32M
        join_buffer_size=32M


        然后5个连接,开始做大查询操作,没过多久,mysql进程就因为OOM被kill了


          Aug 31 05:37:40 localhost kernel: Out of memory: Kill process 2534 (mysqld) score 658 or sacrifice child
          Aug 31 05:37:40 localhost kernelKilled process 2534 (mysqld), UID 1001, total-vm:1825792kBanon-rss:654388kBfile-rss:0kBshmem-rss:0kB


          mysql守护进程就开始启动mysql服务


            /u02/mysql/bin/mysqld_safe: 行 198:  2534 已杀死                  nohup u02/mysql/bin/mysqld --defaults-file=/u02/conf/my3308.cnf --basedir=/u02/mysql --datadir=/u02/data/3308 --plugin-dir=/u02/mysql/lib/plugin --user=mysql --log-error=/u02/log/3308/error.log --open-files-limit=65535 --pid-file=/u02/run/3308/mysqld.pid --socket=/u02/run/3308/mysql.sock --port=3308 < dev/null > dev/null 2>&1
            2020-08-30T21:37:40.375749Z mysqld_safe Number of processes running now: 0
            2020-08-30T21:37:40.407781Z mysqld_safe mysqld restarted
            2020-08-30T21:37:40.666886Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
            2020-08-30T21:37:40.667059Z 0 [Note] --secure-file-priv is set to NULL. Operations related to importing and exporting data are disabled
            2020-08-30T21:37:40.667112Z 0 [Note] u02/mysql/bin/mysqld (mysqld 5.7.26-log) starting as process 2954 ...
            2020-08-30T21:37:40.782412Z 0 [Warning] InnoDB: Using innodb_file_format is deprecated and the parameter may be removed in future releases. See http://dev.mysql.com/doc/refman/5.7/en/innodb-file-format.html
            2020-08-30T21:37:40.782684Z 0 [Note] InnoDB: PUNCH HOLE support available
            2020-08-30T21:37:40.782729Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
            2020-08-30T21:37:40.782754Z 0 [Note] InnoDB: Uses event mutexes
            2020-08-30T21:37:40.782772Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
            2020-08-30T21:37:40.782788Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
            2020-08-30T21:37:40.782841Z 0 [Note] InnoDB: Adjusting innodb_buffer_pool_instances from 8 to 1 since innodb_buffer_pool_size is less than 1024 MiB
            2020-08-30T21:37:40.784518Z 0 [Note] InnoDB: Number of pools: 1
            2020-08-30T21:37:40.784865Z 0 [Note] InnoDB: Using CPU crc32 instructions
            2020-08-30T21:37:40.789314Z 0 [Note] InnoDB: Initializing buffer pool, total size = 512M, instances = 1, chunk size = 128M
            2020-08-30T21:37:40.834948Z 0 [Note] InnoDB: Completed initialization of buffer pool
            2020-08-30T21:37:40.843612Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
            2020-08-30T21:37:40.859028Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
            2020-08-30T21:37:40.863176Z 0 [Note] InnoDB: Log scan progressed past the checkpoint lsn 5707394229
            2020-08-30T21:37:40.863221Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 5707394238
            2020-08-30T21:37:40.863231Z 0 [Note] InnoDB: Database was not shutdown normally!
            2020-08-30T21:37:40.863239Z 0 [Note] InnoDB: Starting crash recovery.
            2020-08-30T21:37:40.901955Z 0 [Note] InnoDB: Last MySQL binlog file position 0 43848, file name binlog.000025
            2020-08-30T21:37:41.075805Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
            2020-08-30T21:37:41.075860Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
            2020-08-30T21:37:41.075952Z 0 [Note] InnoDB: Setting file '/u02/log/3308/iblog/ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
            2020-08-30T21:37:41.254016Z 0 [Note] InnoDB: File '/u02/log/3308/iblog/ibtmp1' size is now 12 MB.
            2020-08-30T21:37:41.255390Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
            2020-08-30T21:37:41.255421Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
            2020-08-30T21:37:41.256171Z 0 [Note] InnoDB: Waiting for purge to start
            2020-08-30T21:37:41.307237Z 0 [Note] InnoDB: 5.7.26 started; log sequence number 5707394238
            2020-08-30T21:37:41.308291Z 0 [Note] Plugin 'FEDERATED' is disabled.
            2020-08-30T21:37:41.310625Z 0 [Note] InnoDB: Loading buffer pool(s) from u02/log/3308/iblog/ib_buffer_pool
            2020-08-30T21:37:41.310785Z 0 [Note] InnoDB: Buffer pool(s) load completed at 200831  5:37:41 (/u02/log/3308/iblog/ib_buffer_pool was empty)
            2020-08-30T21:37:41.314568Z 0 [Note] Recovering after a crash using u02/log/3308/binlog/binlog
            2020-08-30T21:37:41.314730Z 0 [Note] Starting crash recovery...
            2020-08-30T21:37:41.314842Z 0 [Note] Crash recovery finished.
            2020-08-30T21:37:41.346280Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
            2020-08-30T21:37:41.346337Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
            2020-08-30T21:37:41.349079Z 0 [Warning] CA certificate ca.pem is self signed.
            2020-08-30T21:37:41.349341Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
            2020-08-30T21:37:41.350297Z 0 [Note] Server hostname (bind-address): '0.0.0.0'; port: 3308
            2020-08-30T21:37:41.350399Z 0 [Note]   - '0.0.0.0' resolves to '0.0.0.0';
            2020-08-30T21:37:41.350475Z 0 [Note] Server socket created on IP: '0.0.0.0'.
            2020-08-30T21:37:41.376794Z 0 [Note] Failed to start slave threads for channel ''
            2020-08-30T21:37:41.397237Z 0 [Note] Event Scheduler: Loaded 0 events
            2020-08-30T21:37:41.397480Z 0 [Note] u02/mysql/bin/mysqld: ready for connections.
            Version'5.7.26-log'  socket: '/u02/run/3308/mysql.sock'  port: 3308  Source distribution


            正在连接的会话自动中断


              ERROR 2013 (HY000): Lost connection to MySQL server during query
              mysql>


              在配置mysql参数,一定要考虑以下3个因素
              1.业务连接预期总数
              2.会话初始化内存
              3.buffer pool缓冲器大小


              下面先用sql查询一下会话的内存总大小和数据库buffer pool大小

                mysql> select (@@innodb_buffer_pool_size+@@innodb_log_buffer_size+@@key_buffer_size) / 1024 /1024 AS MEMORY_MB;
                +--------------+
                | MEMORY_MB    |
                +--------------+
                | 584.00000000 |
                +--------------+
                1 row in set (0.00 sec)


                mysql> select (@@read_buffer_size+@@read_rnd_buffer_size+@@sort_buffer_size+@@tmp_table_size+@@join_buffer_size+@@binlog_cache_size)/1024/1024 as MB;
                +--------------+
                | MB           |
                +--------------+
                | 160.03125000 |
                +--------------+
                1 row in set (0.00 sec)


                建议

                mysql的buffer pool加上会话内存总大小不超过服务器内存总大小的80%。
                假设,服务器内存为32G,mysql数据库内存设置就不要超过26G,如果buffer pool设置为20G,每个会话内存为160M,则6G能提供6 *1024 /160 =38 个安全连接,超过了,可能造成内存不够,用swap分区。











                最近写的一个定时任务,定时执行更新一张表数据的操作,但是每次执行都会导致服务器挂掉。。。。

                日志显示如下;




                Out of memory: Kill process 830(mysqld) score 611 or sacrifice child 

                MySQL占用的太多内存.这台服务器内存只有1g,看来是内存不足。

                Out of memory 问题,这通常是因为某时刻应用程序大量请求内存导致系统内存不足造成的,这通常会触发 Linux 内核里的 Out of Memory (OOM) killer,OOM killer 会杀掉某个进程以腾出内存留给系统用,不致于让系统立刻崩溃。


                在整体的系统运行过程中,数据服务器 MySQL 的压力是最大的,不仅占用很多的内存和 cpu 资源,而且占用着大部分的磁盘 io 资源。

                一、优化mysql缓存----

                mysql>show variables like '%query_cache%';     mysql本身是有对sql语句缓存的机制的,合理设置我们的mysql缓存可以降低数据库的io资源。


                    #query_cache_type= 查询缓存的方式(默认是 ON)
                    query_cache_size 如果你希望禁用查询缓存,设置 query_cache_size=0。禁用了查询缓存,将没有明显的开销,
                    query_cache_limit 不缓存大于这个值的结果。(缺省为 1M)


                找到/etc/mysql/my.cnf 配置进行修改(注意备份):首先我把query_cache_size这个值改为了0.禁用了缓存


                二、再提一下key_buffer_size


                key_buffer_size是对MyISAM表性能影响最大的一个参数

                key_buffer_size表示索引缓冲区的大小,严格说是它决定了数据库索引处理的速度,尤其是索引读的速度。

                可以检查状态值Key_read_requests和Key_reads,即可知道key_buffer_size设置是否合理。比例key_reads / key_read_requests应该尽可能的低,至少是1:100,1:1000更好,如果这个数值过大,就应该调高key_buffer_size的数值


                对于1G内存的机器,如果不使用MyISAM表,推荐值是16M(8-64M)


                重启mysql:  service mysqld start


                启动服务,测试。没有再报错。


                附上相关博客:http://blog.csdn.net/rainysia/article/details/8767946

                http://blog.sina.com.cn/s/blog_8a18c33d01011odu.html


                打赏

                本文链接:https://www.kinber.cn/post/3979.html 转载需授权!

                分享到:


                推荐本站淘宝优惠价购买喜欢的宝贝:

                image.png

                 您阅读本篇文章共花了: 

                群贤毕至

                访客