gdb disassembly (disas): the source-line ordering problem

Date: 2023-07-29 18:01:40
Tags: le, mle, disas, pc, gdb, source code, source, line, Line

The problem

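During development you sometimes need to look at the assembly generated from a cpp file to confirm something. Bare assembly makes it hard to follow the underlying logic, so it helps to have the corresponding source locations listed alongside it. In earlier versions of gdb this was done with the /m modifier (option) of the disas command.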

Anyone who has used this option will have found its output very confusing. gdb's own built-in help says as much about it:

With a /m modifier, source lines are included (if available).
This view is "source centric": the output is in source line order,
regardless of any optimization that is present.  Only the main source file
is displayed, not those of, e.g., any inlined functions.
This modifier hasn't proved useful in practice and is deprecated
in favor of /s.

How gdb implements it

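The implementation is fairly straightforward. From the debug information gdb builds a table of linetable_entry records, each of which holds the memory address that corresponds to a source line. After reading these [line, address] pairs for the function, gdb sorts them by line number.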

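One detail: when two adjacent entries have the same line number and the same address, the earlier of the two is skipped as a duplicate: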

      if (le[i].line == le[i + 1].line && le[i].pc == le[i + 1].pc)
	continue;		/* Ignore duplicates.  */
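To make the condition concrete, here is a minimal standalone sketch (not gdb code). Entry, copy_without_duplicates and duplicate_demo are hypothetical names, and only the duplicate test shown above is modeled. An entry is skipped only when both its line and its pc equal those of the next entry; entries that share a line number but sit at different addresses are all kept.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for gdb's linetable_entry: one line/address pair.
struct Entry
{
  int line;
  std::uint64_t pc;
};

// Copy entries while applying only the duplicate test: entry i is skipped
// when BOTH its line and its pc equal those of entry i + 1.  The last entry
// has no successor to compare with, so this sketch stops before it.
std::vector<Entry>
copy_without_duplicates (const std::vector<Entry> &le)
{
  std::vector<Entry> out;
  for (std::size_t i = 0; i + 1 < le.size (); i++)
    {
      if (le[i].line == le[i + 1].line && le[i].pc == le[i + 1].pc)
        continue;   // Exact duplicate of the next entry.
      out.push_back (le[i]);
    }
  return out;
}

// Made-up data: the two (8, 0x20) entries are identical, so the first one
// is dropped; the two line-7 entries differ in pc, so both survive.
std::vector<Entry>
duplicate_demo ()
{
  const std::vector<Entry> le = {
    {7, 0x10}, {8, 0x20}, {8, 0x20}, {7, 0x30}, {0, 0x40}
  };
  std::vector<Entry> out = copy_without_duplicates (le);
  assert (out.size () == 3);
  return out;
}
```

In other words, a repeated line number on its own is not enough for an entry to be ignored; the address has to repeat as well.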

The structure and the complete function:

/* Each item represents a line-->pc (or the reverse) mapping.  This is
   somewhat more wasteful of space than one might wish, but since only
   the files which are actually debugged are read in to core, we don't
   waste much space.  */

struct linetable_entry
{
  /* The line number for this entry.  */
  int line;

  /* True if this PC is a good location to place a breakpoint for LINE.  */
  unsigned is_stmt : 1;

  /* The address for this entry.  */
  CORE_ADDR pc;
};

///@file: gdb-10.1\gdb\disasm.c
/* The idea here is to present a source-O-centric view of a
   function to the user.  This means that things are presented
   in source order, with (possibly) out of order assembly
   immediately following.

   N.B. This view is deprecated.  */

static void
do_mixed_source_and_assembly_deprecated
  (struct gdbarch *gdbarch, struct ui_out *uiout,
   struct symtab *symtab,
   CORE_ADDR low, CORE_ADDR high,
   int how_many, gdb_disassembly_flags flags)
{
  int newlines = 0;
  int nlines;
  struct linetable_entry *le;
  struct deprecated_dis_line_entry *mle;
  struct symtab_and_line sal;
  int i;
  int out_of_order = 0;
  int next_line = 0;
  int num_displayed = 0;
  print_source_lines_flags psl_flags = 0;

  gdb_assert (symtab != NULL && SYMTAB_LINETABLE (symtab) != NULL);

  nlines = SYMTAB_LINETABLE (symtab)->nitems;
  le = SYMTAB_LINETABLE (symtab)->item;

  if (flags & DISASSEMBLY_FILENAME)
    psl_flags |= PRINT_SOURCE_LINES_FILENAME;

  mle = (struct deprecated_dis_line_entry *)
    alloca (nlines * sizeof (struct deprecated_dis_line_entry));

  /* Copy linetable entries for this function into our data
     structure, creating end_pc's and setting out_of_order as
     appropriate.  */

  /* First, skip all the preceding functions.  */

  for (i = 0; i < nlines - 1 && le[i].pc < low; i++);

  /* Now, copy all entries before the end of this function.  */

  for (; i < nlines - 1 && le[i].pc < high; i++)
    {
      if (le[i].line == le[i + 1].line && le[i].pc == le[i + 1].pc)
	continue;		/* Ignore duplicates.  */

      /* Skip any end-of-function markers.  */
      if (le[i].line == 0)
	continue;

      mle[newlines].line = le[i].line;
      if (le[i].line > le[i + 1].line)
	out_of_order = 1;
      mle[newlines].start_pc = le[i].pc;
      mle[newlines].end_pc = le[i + 1].pc;
      newlines++;
    }

  /* If we're on the last line, and it's part of the function,
     then we need to get the end pc in a special way.  */

  if (i == nlines - 1 && le[i].pc < high)
    {
      mle[newlines].line = le[i].line;
      mle[newlines].start_pc = le[i].pc;
      sal = find_pc_line (le[i].pc, 0);
      mle[newlines].end_pc = sal.end;
      newlines++;
    }

  /* Now, sort mle by line #s (and, then by addresses within lines).  */

  if (out_of_order)
    std::sort (mle, mle + newlines, line_is_less_than);

  /* Now, for each line entry, emit the specified lines (unless
     they have been emitted before), followed by the assembly code
     for that line.  */

  ui_out_emit_list asm_insns_list (uiout, "asm_insns");

  gdb::optional<ui_out_emit_tuple> outer_tuple_emitter;
  gdb::optional<ui_out_emit_list> inner_list_emitter;

  for (i = 0; i < newlines; i++)
    {
      /* Print out everything from next_line to the current line.  */
      if (mle[i].line >= next_line)
	{
	  if (next_line != 0)
	    {
	      /* Just one line to print.  */
	      if (next_line == mle[i].line)
		{
		  outer_tuple_emitter.emplace (uiout, "src_and_asm_line");
		  print_source_lines (symtab, next_line, mle[i].line + 1, psl_flags);
		}
	      else
		{
		  /* Several source lines w/o asm instructions associated.  */
		  for (; next_line < mle[i].line; next_line++)
		    {
		      ui_out_emit_tuple tuple_emitter (uiout,
						       "src_and_asm_line");
		      print_source_lines (symtab, next_line, next_line + 1,
					  psl_flags);
		      ui_out_emit_list temp_list_emitter (uiout,
							  "line_asm_insn");
		    }
		  /* Print the last line and leave list open for
		     asm instructions to be added.  */
		  outer_tuple_emitter.emplace (uiout, "src_and_asm_line");
		  print_source_lines (symtab, next_line, mle[i].line + 1, psl_flags);
		}
	    }
	  else
	    {
	      outer_tuple_emitter.emplace (uiout, "src_and_asm_line");
	      print_source_lines (symtab, mle[i].line, mle[i].line + 1, psl_flags);
	    }

	  next_line = mle[i].line + 1;
	  inner_list_emitter.emplace (uiout, "line_asm_insn");
	}

      num_displayed += dump_insns (gdbarch, uiout,
				   mle[i].start_pc, mle[i].end_pc,
				   how_many, flags, NULL);

      /* When we've reached the end of the mle array, or we've seen the last
         assembly range for this source line, close out the list/tuple.  */
      if (i == (newlines - 1) || mle[i + 1].line > mle[i].line)
	{
	  inner_list_emitter.reset ();
	  outer_tuple_emitter.reset ();
	  uiout->text ("\n");
	}
      if (how_many >= 0 && num_displayed >= how_many)
	break;
    }
}
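The display loop at the end of the function is easier to follow in isolation. The sketch below is a simplified model of its next_line bookkeeping, not gdb code: printed_source_lines is a hypothetical name, the ui_out machinery is left out, and the input is assumed to be the line numbers of the mle entries after sorting. It shows two effects: a source line is printed only once even when several address ranges map to it, and source lines that have no instructions of their own are still printed in between.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Given the line numbers of the entries AFTER sorting, return the source
// lines the loop would print, each paired with a flag telling whether any
// instructions are listed under it.
std::vector<std::pair<int, bool>>
printed_source_lines (const std::vector<int> &sorted_lines)
{
  std::vector<std::pair<int, bool>> out;
  int next_line = 0;   // Same role as next_line in the gdb function.

  for (int line : sorted_lines)
    {
      // This sketch assumes real line numbers; the copy loop skips the
      // line-0 end-of-function markers.
      assert (line > 0);

      if (line < next_line)
        continue;   // Line already printed: only more instructions follow.

      // Lines between the previous entry and this one have no instructions
      // of their own, but are still printed (with an empty list).
      if (next_line != 0)
        for (; next_line < line; next_line++)
          out.emplace_back (next_line, false);

      out.emplace_back (line, true);
      next_line = line + 1;
    }
  return out;
}
```

For the sorted lines {3, 4, 4, 7}, the sketch reports lines 3 and 4 with instructions, lines 5 and 6 without, and line 7 with instructions; the second entry for line 4 adds instructions under the line that is already printed.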

Line information generated by gcc

  • Line-number information
tsecer@harry: cat main.cpp 
#include <string.h>
#include <stdio.h>

struct tsecer
{
    tsecer()
    {
        memset(arr, 0, sizeof(0));
    }

    char arr[100];
};


int main(int argc, const char *argv[])
{
    for (int i = 0; i < argc; i++)
    {
        tsecer t;
        printf("hello world\n");
    }

    return 0;
}
tsecer@harry: g++ -g main.cpp 
tsecer@harry: readelf --debug-dump a.out                           
Contents of the .debug_aranges section:

 Line Number Statements:
  [0x000000b1]  Extended opcode 2: set Address to 0x4005c8
  [0x000000bc]  Special opcode 10: advance Address by 0 to 0x4005c8 and Line by 5 to 6
  [0x000000bd]  Special opcode 175: advance Address by 12 to 0x4005d4 and Line by 2 to 8
  [0x000000be]  Advance PC by constant 17 to 0x4005e5
  [0x000000bf]  Special opcode 76: advance Address by 5 to 0x4005ea and Line by 1 to 9
  [0x000000c0]  Advance PC by 3 to 0x4005ed
  [0x000000c2]  Extended opcode 1: End of Sequence

  [0x000000c5]  Extended opcode 2: set Address to 0x400587
  [0x000000d0]  Advance Line by 15 to 16
  [0x000000d2]  Copy
  [0x000000d3]  Special opcode 216: advance Address by 15 to 0x400596 and Line by 1 to 17
  [0x000000d4]  Extended opcode 4: set Discriminator to 1
  [0x000000d8]  Set is_stmt to 0
  [0x000000d9]  Special opcode 103: advance Address by 7 to 0x40059d and Line by 0 to 17
  [0x000000da]  Set is_stmt to 1
  [0x000000db]  Special opcode 119: advance Address by 8 to 0x4005a5 and Line by 2 to 19
  [0x000000dc]  Special opcode 174: advance Address by 12 to 0x4005b1 and Line by 1 to 20
  [0x000000dd]  Special opcode 142: advance Address by 10 to 0x4005bb and Line by -3 to 17
  [0x000000de]  Special opcode 95: advance Address by 6 to 0x4005c1 and Line by 6 to 23
  [0x000000df]  Special opcode 76: advance Address by 5 to 0x4005c6 and Line by 1 to 24
  [0x000000e0]  Advance PC by 2 to 0x4005c8
  [0x000000e2]  Extended opcode 1: End of Sequence
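Each "Special opcode" row packs an address advance and a line advance into a single number. The sketch below decodes that number with the standard DWARF formula. The excerpt does not show the line-program header, so line_base = -5, line_range = 14 and min_inst_length = 1 are assumptions here; they are consistent with every special-opcode row in the dump. The number readelf prints after "Special opcode" is treated as the adjusted opcode (the raw opcode minus opcode_base). LineHeader, Advance and decode_special_opcode are hypothetical names.

```cpp
#include <cassert>

// Line-program header fields.  The excerpt does not show the header, so
// the defaults below are ASSUMED; they are consistent with the dump.
struct LineHeader
{
  int line_base = -5;
  int line_range = 14;
  int min_inst_length = 1;
};

struct Advance
{
  int address;   // Bytes added to the address register.
  int line;      // Signed amount added to the line register.
};

// Decode an adjusted special opcode (raw opcode minus opcode_base), which
// is the number printed after "Special opcode" in the dump.
Advance
decode_special_opcode (int adjusted, const LineHeader &h = LineHeader ())
{
  assert (adjusted >= 0 && h.line_range > 0);
  Advance a;
  a.address = (adjusted / h.line_range) * h.min_inst_length;
  a.line = h.line_base + adjusted % h.line_range;
  return a;
}
```

With these assumed header values, decode_special_opcode (142) gives an address advance of 10 and a line advance of -3, which matches the row at [0x000000dd].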

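For a typical loop, the machine instructions generated for it are not contiguous, and the debug information reflects this. In the three lines below, note that the middle one moves the line number back by 3: the instructions that close the for loop, ending with the jump back to the top of the loop, are placed after the loop body but still belong to line 17, the line of the for statement.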

  [0x000000dc]  Special opcode 174: advance Address by 12 to 0x4005b1 and Line by 1 to 20
  [0x000000dd]  Special opcode 142: advance Address by 10 to 0x4005bb and Line by -3 to 17
  [0x000000de]  Special opcode 95: advance Address by 6 to 0x4005c1 and Line by 6 to 23
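The effect of sorting by line number on this function can be sketched as follows. This is a simplified model, not gdb code: Row, Range, group_by_line and main_rows are hypothetical names, the rows are copied straight from the readelf dump of main (with the end of the sequence represented as line 0), and the table gdb actually builds may differ in detail. The sort order, by line and then by address within a line, follows the comment in the gdb function; the sketch sorts unconditionally, whereas gdb sorts only when it has seen a line number go backwards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Row { int line; std::uint64_t pc; };   // One line-table row.

struct Range   // Hypothetical counterpart of deprecated_dis_line_entry.
{
  int line;
  std::uint64_t start_pc;
  std::uint64_t end_pc;
};

// Turn address-ordered rows into [start_pc, end_pc) ranges, then sort the
// ranges by line number and, within a line, by address.
std::vector<Range>
group_by_line (const std::vector<Row> &rows)
{
  std::vector<Range> out;
  for (std::size_t i = 0; i + 1 < rows.size (); i++)
    {
      if (rows[i].line == 0)
        continue;   // End-of-sequence marker.
      assert (rows[i].pc <= rows[i + 1].pc);   // Input is in address order.
      out.push_back ({rows[i].line, rows[i].pc, rows[i + 1].pc});
    }
  std::sort (out.begin (), out.end (),
             [] (const Range &a, const Range &b)
             {
               if (a.line != b.line)
                 return a.line < b.line;
               return a.start_pc < b.start_pc;
             });
  return out;
}

// Rows of main () as listed in the readelf dump; line 0 marks the end.
std::vector<Row>
main_rows ()
{
  return {
    {16, 0x400587}, {17, 0x400596}, {17, 0x40059d}, {19, 0x4005a5},
    {20, 0x4005b1}, {17, 0x4005bb}, {23, 0x4005c1}, {24, 0x4005c6},
    {0, 0x4005c8},
  };
}
```

Calling group_by_line (main_rows ()) yields eight ranges. The range 0x4005bb to 0x4005c1 moves up to sit with the other two line-17 ranges, ahead of the ranges for lines 19 and 20, even though its addresses are higher than theirs. This is why the assembly under a source line in the /m view can jump around in address order.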

Compiler optimization

All of the above is what you see with optimization turned off. With optimization enabled, the output becomes even more confusing.

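Because duplicate line entries are dropped and the rest are regrouped by line number, instructions can end up attributed to a source line they do not belong to, so the number of machine instructions shown for each source line is no longer reliable.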

From: https://www.cnblogs.com/tsecer/p/17590216.html
