Crash on AIX produces no core or a truncated core

Date: 2023-04-19 17:25:35

Troubleshooting


Problem

This document outlines what needs to be done to ensure that a full core file is produced on AIX if WebSphere Application Server crashes.

Resolving The Problem

WebSphere Application Server should generate a system core dump file when it crashes, when a dump is triggered manually, and in some OutOfMemory conditions. A complete system core dump is needed to diagnose crashes, some OutOfMemory issues, and certain other problems. Several conditions can cause a core dump to be truncated and unusable.
NOTE: A separate technote covers cases where the process does not record a crash event.


1. SET ULIMITS
See Also: Guidelines for Setting Ulimits

The ulimits for core and fsize must be set so that both the hard and soft limits are unlimited. Changing them may require root access.

To set them globally, edit the /etc/security/limits file and change the core and fsize settings for both the hard and soft limits. However, if the application server is started by the init process at startup, these settings do not take effect. In that case, set the limits with the ulimit command directly in the init.d script.
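
Before starting the server, it can help to inspect the limits of the shell that will launch it. The sketch below is only an illustration: `check_limit` is a hypothetical helper, not an AIX or WebSphere command. It flags any core or file-size limit that is not unlimited, and it should be run as the same user, and from the same shell or script, that starts the application server.

```shell
#!/bin/sh
# check_limit is a hypothetical helper, not an AIX or WebSphere command.
# $1 = label to print, $2 = limit value as reported by ulimit
check_limit() {
    if [ "$2" = "unlimited" ]; then
        echo "$1: OK (unlimited)"
    else
        echo "$1: WARNING (currently $2)"
    fi
}

# -S reports the soft limit, -H the hard limit;
# -c is the core file size, -f the maximum file size.
check_limit "core soft"  "$(ulimit -S -c)"
check_limit "core hard"  "$(ulimit -H -c)"
check_limit "fsize soft" "$(ulimit -S -f)"
check_limit "fsize hard" "$(ulimit -H -f)"
```

In a start script, the matching settings would be `ulimit -c unlimited` and `ulimit -f unlimited`, placed before the command that launches the server.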

To validate an already running application server process, capture a javacore (kill -3 PID), open it in a text editor, and look for "RLIMIT_CORE" and "RLIMIT_FSIZE".
** NOTE: If the application server is associated with a nodeagent, BOTH the nodeagent and the application server MUST be restarted to pick up the change. If the installation has no nodeagent, only the application server must be restarted.
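
Instead of opening the javacore in an editor, the two entries can be pulled out with grep. This is a minimal sketch: `show_dump_limits` is a hypothetical helper name, and the javacore file name in the example call is a placeholder. The exact layout of the matching lines may vary between SDK versions, so read the output rather than parsing it further.

```shell
#!/bin/sh
# show_dump_limits is a hypothetical helper name.
# $1 = path to a javacore file produced by kill -3 PID
show_dump_limits() {
    grep -E 'RLIMIT_(CORE|FSIZE)' "$1"
}

# Example call; the file name is a placeholder, so substitute the real javacore path:
# show_dump_limits javacore.txt
```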

2. CONFIGURE FULL CORE ON THE OPERATING SYSTEM
Check your OS configuration (in the SMIT tool) to see whether the fullcore option is set to true.

If it is not set, the IBM SDK reports this in native_stderr.log (or wherever standard error is directed) with the following message when a core dump is generated:

Note: "Enable full CORE dump" in smit is set to FALSE and as a result there will be limited threading information in core file.

If you do not have access to the SMIT administration tool, the following flag can be set from the command line (as the root user):

To set full core generation:
chdev -a fullcore=true -lsys0

To verify full core is set:
lsattr -Elsys0 | grep full
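
The verification step can also be scripted. The sketch below assumes that the attribute value is the second column of the lsattr output. `fullcore_value` is a hypothetical helper, and the query itself only runs when `uname -s` reports AIX.

```shell
#!/bin/sh
# fullcore_value is a hypothetical helper: it reads lsattr-style output on
# stdin and prints the second field of the line whose first field is "fullcore".
fullcore_value() {
    awk '$1 == "fullcore" { print $2 }'
}

# Only query the device attribute on AIX; the lsattr command found on
# other systems is an unrelated tool.
if [ "$(uname -s)" = "AIX" ]; then
    value=$(lsattr -E -l sys0 -a fullcore | fullcore_value)
    if [ "$value" = "true" ]; then
        echo "fullcore is enabled"
    else
        echo "fullcore is NOT enabled (reported: ${value:-nothing})"
    fi
fi
```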



3. DISK SPACE
Check the partitions where WebSphere Application Server resides and make sure there is enough free space for the dump to be written. If the core could not be written, an error message usually appears in native_stderr.log.

To check all of your partitions, execute this command (the -k is for kilobytes):

df -k
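
To check a single dump directory against a size threshold, the df output can be reduced to one number. This is a sketch under stated assumptions: `free_kb` and `has_room` are hypothetical helpers, and the default directory and the 4 GB figure are placeholders to replace with the real dump location and the expected size of the process. It uses `df -Pk` (POSIX output format) so that the available space is always in the fourth column.

```shell
#!/bin/sh
# free_kb is a hypothetical helper: print the free space, in kilobytes,
# of the file system that holds directory $1.
# -P requests the POSIX output format, in which column 4 is the available space.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_room succeeds when directory $1 has at least $2 kilobytes free.
has_room() {
    [ "$(free_kb "$1")" -ge "$2" ]
}

# Example: the directory and the 4 GB figure are placeholders.
dump_dir=${DUMP_DIR:-/tmp}
needed_kb=$((4 * 1024 * 1024))
if has_room "$dump_dir" "$needed_kb"; then
    echo "$dump_dir: at least ${needed_kb} KB free"
else
    echo "$dump_dir: less than ${needed_kb} KB free"
fi
```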
=======================================================
** Stop after step 3. Only perform the steps below if specifically instructed by IBM Support.
4. DISABLE SIGNAL HANDLERS
To force the operating system to handle all signals sent to the JVM process, you can disable all JVM signal handlers.

For IBM SDK 6.0 and later, set this JVM argument:
-Xrs

NOTE: On SDK 6.0 and later, to prevent unintentional crashes due to SIGTRAP, clear the shared class cache by executing <WAS_HOME>/bin/clearClassCache.sh



5. EXECUTE "pdump.sh" SCRIPT
In cases where core files are still not being produced, you can execute the attached script pdump.sh to extract information from the running process. This is especially helpful if you suspect the process is in a zombie state and does not respond to any signals.

You can download the latest version from this location:
ftp://ftp.software.ibm.com/aix/tools/debug/pdump.sh

pdump.sh <Java_PID>


This creates a file named pdump.java.###.txt. Locate the line containing the string "sigcatch". If SEGV is listed in the output, the signal is being caught. Both SEGV and SIGSEGV represent signal 11.
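
The sigcatch check can be done with grep. In this minimal sketch, `segv_caught` is a hypothetical helper, and it assumes that the caught signals appear on the same line as the string "sigcatch". Because it searches for SEGV, it matches both the SEGV and SIGSEGV spellings.

```shell
#!/bin/sh
# segv_caught is a hypothetical helper: it succeeds when the "sigcatch"
# line of the pdump output file $1 mentions SEGV.
segv_caught() {
    grep 'sigcatch' "$1" | grep -q 'SEGV'
}

# Example call; substitute the file that pdump.sh created:
# segv_caught pdump.java.###.txt && echo "SEGV is being caught"
```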



Additional Questions:
What happens if I do not have write permission in the profile's root directory, or the directory I am redirecting javacores, heapdumps, and system core files to?

The files cannot be written to that directory. Check native_stderr.log for an error, because the JVM may try to write the dump to an alternate directory (such as /tmp).



Why are core files truncated at 2GB even with all ulimit settings set to unlimited?

32-bit processes have a limitation that can be worked around by enabling large file support.
Using a 64-bit version of WebSphere Application Server also removes this limitation, although the dump can still be truncated if you run out of disk space.



Can I test my configuration to see if a core can be generated?

Yes. You can simulate a crash by sending signal 11 to the JVM process. This terminates the process.

kill -11 PID


An alternative is to use the gencore command. This produces a core file and keeps the process running.

gencore PID  
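
After a test with gencore, two things are worth confirming: that the process is still running, and that the core file has a plausible size. The helpers below are hypothetical names used for illustration, and the PID and core path in the example calls are placeholders. A size that stops near 2GB may point to the 32-bit limitation discussed above.

```shell
#!/bin/sh
# Both helpers are hypothetical names used for illustration.

# pid_alive succeeds when a process with ID $1 exists and can be signalled.
pid_alive() {
    kill -0 "$1" 2>/dev/null
}

# file_bytes prints the size of file $1 in bytes (column 5 of ls -l).
file_bytes() {
    ls -l "$1" | awk '{ print $5 }'
}

# Example calls after running gencore; PID and the core path are placeholders:
# pid_alive PID && echo "process is still running"
# file_bytes /path/to/core
```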

Related Information

Submitting information to IBM support

Steps to getting support

MustGather: Read first

Troubleshooting guide

   

From: https://www.cnblogs.com/z-cm/p/17333977.html
