首页 > 其他分享 >upc 8378: Floating-Point Numbers(模拟浮点数运算)

upc 8378: Floating-Point Numbers(模拟浮点数运算)

时间:2023-06-02 18:32:52浏览次数:42  
标签:binary point 浮点数 Point number upc floating bits ll


8378: Floating-Point Numbers

时间限制: 1 Sec  内存限制: 128 MB
提交: 10  解决: 4
[提交] [状态] [讨论版] [命题人:admin]

题目描述

In this problem, we consider floating-point number formats, data representation formats to approximate real numbers on computers.
Scientific notation is a method to express a number, frequently used for numbers too large or too small to be written tersely in usual decimal form. In scientific notation, all numbers are written in the form m × 10e. Here, m (called significand) is a number greater than or equal to 1 and less than 10, and e (called exponent) is an integer. For example, a number 13.5 is equal to 1.35×101, so we can express it in scientific notation with significand 1.35 and exponent 1.
As binary number representation is convenient on computers, let's consider binary scientific notation with base two, instead of ten. In binary scientific notation, all numbers are written in the form m × 2e. Since the base is two, m is limited to be less than 2. For example, 13.5 is equal to 1.6875×23, so we can express it in binary scientific notation with significand 1.6875 and exponent 3. The significand 1.6875 is equal to 1 + 1/2 + 1/8 + 1/16, which is 1.10112 in binary notation. Similarly, the exponent 3 can be expressed as 112 in binary notation.
A floating-point number expresses a number in binary scientific notation in finite number of bits. Although the accuracy of the significand and the range of the exponent are limited by the number of bits, we can express numbers in a wide range with reasonably high accuracy.
In this problem, we consider a 64-bit floating-point number format, simplified from one actually used widely, in which only those numbers greater than or equal to 1 can be expressed. Here, the first 12 bits are used for the exponent and the remaining 52 bits for the significand. Let's denote the 64 bits of a floating-point number by b64...b1. With e an unsigned binary integer (b64...b53)2, and with m a binary fraction represented by the remaining 52 bits plus one (1.b52...b1)2, the floating-point number represents the number m × 2e.
We show below the bit string of the representation of 13.5 in the format described above.

In floating-point addition operations, the results have to be approximated by numbers representable in floating-point format. Here, we assume that the approximation is by truncation. When the sum of two floating-point numbers a and b is expressed in binary scientific notation as a + b = m × 2e (1 ≤ m < 2, 0 ≤ e < 212), the result of addition operation on them will be a floating-point number with its first 12 bits representing e as an unsigned integer and the remaining 52 bits representing the first 52 bits of the binary fraction of m.
A disadvantage of this approximation method is that the approximation error accumulates easily. To verify this, let's make an experiment of adding a floating-point number many times, as in the pseudocode shown below. Here, s and a are floating-point numbers, and the results of individual addition are approximated as described above.
s := a
for n times {
    s := s + a
}
For a given floating-point number a and a number of repetitions n, compute the bits of the floating-point number s when the above pseudocode finishes.

输入

The input consists of at most 1000 datasets, each in the following format.

b52...b1 
n is the number of repetitions. (1 ≤ n ≤ 1018) For each i, bi is either 0 or 1. As for the floating-point number a in the pseudocode, the exponent is 0 and the significand is b52...b1.

The end of the input is indicated by a line containing a zero.

输出

For each dataset, the 64 bits of the floating-point number s after finishing the pseudocode should be output as a sequence of 64 digits, each being 0 or 1 in one line.

样例输入

1
0000000000000000000000000000000000000000000000000000
2
0000000000000000000000000000000000000000000000000000
3
0000000000000000000000000000000000000000000000000000
4
0000000000000000000000000000000000000000000000000000
7
1101000000000000000000000000000000000000000000000000
100
1100011010100001100111100101000111001001111100101011
123456789
1010101010101010101010101010101010101010101010101010
1000000000000000000
1111111111111111111111111111111111111111111111111111
0

样例输出

0000000000010000000000000000000000000000000000000000000000000000
0000000000011000000000000000000000000000000000000000000000000000
0000000000100000000000000000000000000000000000000000000000000000
0000000000100100000000000000000000000000000000000000000000000000
0000000000111101000000000000000000000000000000000000000000000000
0000000001110110011010111011100001101110110010001001010101111111
0000000110111000100001110101011001000111100001010011110101011000
0000001101010000000000000000000000000000000000000000000000000000


来源/分类

ICPC Japan IISF 2018 

[提交] [状态]

【题意】

看图,解释了小数的科学计数法存储方式,前12位存指数,后52位存小数部分,整数部分默认为1。

给出n,和小数a的后52位,求n+1个a相加的结果(注意是逐个相加,并考虑每一步的误差损失),用题目描述的存储方式输出。

【分析】

错误做法1:我第一遍直接用java大数计算出来 (n+1)*a 的具体值,然后从最高位的第二位开始数52位保留下来,作为小数,剩下的位数长度作为指数e。结果前7个样例是对的,最后一个是错的。错误:题目声明了这种存储方式是有误差损失的,而我直接大数算出具体值是不损失的,故答案不符。

错误做法2:放弃大数之后,我想到用快速幂的方法实现这n次加法,然而得出的结果还是最后一个是错的。原因:快速幂过程中,只会加倍,并不能考虑到逐个加a时产生的损失, 因为逐个加a时,没加一次都有可能产生误差,而用快速幂的思想只能在加倍是产生误差,大概可以理解成 产生的误差不一样。

 

正确做法:考虑一个过程,我们是逐个加a,其实很多时候,我们加上一个a,数值的最高位可能并不变化,也就是不产生进位,那指数e肯定不会变化。

所以我们寻找临界状态,即现在的值 加上 几个a,能恰好进位,这个用普通除法就能算出。然后我们直接让数值加上这些a,进位。一旦进位,指数e就可以加1,而小数部分就需要右移一位了。 不断重复这个过程,知道n用尽或者小数部分不足以产生影响。

小数部分直接用long long型变量存下来。

【代码】

/****
***author: winter2121
****/
#include<bits/stdc++.h>
#define rep(i,s,t) for(int i=s;i<=t;i++)
#define SI(i) scanf("%d",&i)
#define PI(i) printf("%d\n",i)
using namespace std;
typedef long long ll;
const int mod=1e9+7;
const int MAX=2e5+5;
const int INF=0x3f3f3f3f;
const double eps=1e-8;
int dir[9][2]={0,1,0,-1,1,0,-1,0, -1,-1,-1,1,1,-1,1,1};
template<class T>bool gmax(T &a,T b){return a<b?a=b,1:0;}
template<class T>bool gmin(T &a,T b){return a>b?a=b,1:0;}
template<class T>void gmod(T &a,T b){a=((a+b)%mod+mod)%mod;}
 
ll gcd(ll a,ll b){ while(b) b^=a^=b^=a%=b; return a;}
ll inv(ll b){return b==1?1:(mod-mod/b)*inv(mod%b)%mod;}
ll qpow(ll n,ll m)
{
    n%=mod;
    ll ans=1;
    for(;m;m>>=1)
    {
        if(m&1)ans=ans*n%mod;
        n=n*n%mod;
    }
    return ans;
}
 
int main()
{
    ll n;
    int bit;
    while(scanf("%lld",&n),n)
    {
        ll num=1ll<<52;
        for(int i=51;i>=0;i--)
        {
            scanf("%1d",&bit);
            num|=((ll)bit)<<i;
        }
        ll ans=0,res=num;
        while(n>0)
        {
         //   cout<<num<<endl;
            ll Tim=((1ll<<53)-res +num-1)/num; //Tim次就能加到1<<53
         //   cout<<n<<' '<<Tim<<endl;
            if(Tim>n) //超了
            {
                res+=n*num;
                break;
            }
            else
            {
                res+=Tim*num;
                ans++; //进位
                res>>=1; //产生了进位,幂加1,所以res右移1
                num>>=1; //幂涨了,所以最初的小数也就右移了
                if(num<1)break; //当num不足1时,不会再对后面的加法产生影响。
                n-=Tim;
            }
        }
        for(int i=11;i>=0;i--)
            printf("%d",((1<<i)&ans)?1:0);
        for(int i=51;i>=0;i--)
            printf("%d",((1ll<<i)&res)?1:0);
        printf("\n");
    }
    return 0;
}

 

标签:binary,point,浮点数,Point,number,upc,floating,bits,ll
From: https://blog.51cto.com/u_16125110/6404558

相关文章

  • upc 6902: Trie树 (状压dp)
    6902:Trie树时间限制:1Sec  内存限制:128MB提交:137  解决:19[提交][状态][讨论版][命题人:admin]题目描述字母(Trie)树是一个表示一个字符串集合中所有字符串的前缀的数据结构,其有如下特征:1.树的每一条边表示字母表中的一个字母2.树根表示一个空的前缀3.树上所有......
  • 【十五】breakpoint()函数(1) - 3
    【十五】breakpoint()函数(1)-3.7+【1】作用Python3.7添加了breakpoint(),此函数将您放入调用站点的调试器中。具体来说,它调用sys.breakpointhook(),直接传递args和kws。默认情况下,sys.breakpointhook()调用pdb.set_trace(),不需要参数。在这种情况下,它纯粹是一个方便的函......
  • 报错问题谷粒商城 Oss endpoint can‘t be empty
    报错信息:Causedby:java.lang.IllegalArgumentException:Ossendpointcan’tbeempty.网上查了一下说有两种可能第一种是springboot和springcloud版本不对应第二种错误说的是oss.yml格式错误 建议优先检查yml格式我的因为那天改配置的时候被家里猫按到了,然后没有发现,检......
  • THUPC2023 游记
    五月二十七号上午,我和国家队一行人出发前往北京。说起来这是我第五次来北京了,但上一次来还是在几乎六年前。那时同行的伙伴在我刚上初中时成为了我的好朋友,高中也被分到了一个班,但现在已经没有交流了。我努力回忆着自己对北京的所有印象,但除了很小的时候父亲出差带我到北京玩,其它......
  • 驱动开发:内核读写内存浮点数
    如前所述,在前几章内容中笔者简单介绍了内存读写的基本实现方式,这其中包括了CR3切换读写,MDL映射读写,内存拷贝读写,本章将在如前所述的读写函数进一步封装,并以此来实现驱动读写内存浮点数的目的。内存浮点数的读写依赖于读写内存字节的实现,因为浮点数本质上也可以看作是一个字节集,对......
  • THUPC2023 游记
    之前有人说我这个赛季一年都没写过游记,刚好不想卷这次THU之行给我留下了很深印象,所以就来写篇游记浅浅记录一下。和Linshey,chen_03两位强神也是两位老队友组的队,队名叫「联合省选第一年」,源于我们省今年首次加入联合省选。顺带一提,去年我们的队伍名叫「最后的FJOIer」。5.2......
  • 什么是浮点数加减运算里的对阶,阶码和尾数
    在浮点数加减运算中,对阶是一种重要的步骤,它用于将参与运算的浮点数调整为同一数量级,以便进行精确的计算。对阶涉及到阶码和尾数的概念。在本文中,我将解释这些概念并提供具体的例子,以便更好地理解。首先,浮点数表示法是一种用于表示实数的方法,其中数值被分为阶码和尾数两部分。通常......
  • Problem A: 平面上的点和线——Point类、Line类 (I)
    HomeWebBoardProblemSetStandingStatusStatisticsProblemA:平面上的点和线——Point类、Line类(I)TimeLimit:1Sec  MemoryLimit:128MBSubmit:3609  Solved:2357[Submit][Status][WebBoard]Description在数学上,平面直角坐标系上的点......
  • Problem A: 平面上的点——Point类 (I)
    ProblemA:平面上的点——Point类(I)TimeLimit:1Sec  MemoryLimit:4MBSubmit:8255  Solved:3705[Submit][Status][WebBoard]Description在数学上,平面直角坐标系上的点用X轴和Y轴上的两个坐标值唯一确定。现在我们封装一个“Point类”来实现平面上......
  • upc 6888: 守卫(区间dp O(n^2) )
    6888:守卫时间限制:1Sec  内存限制:512MB提交:102  解决:33[提交][状态][讨论版][命题人:admin]题目描述九条可怜是一个热爱运动的女孩子。这一天她去爬山,她的父亲为了她的安全,雇了一些保镖,让他们固定地呆在在山的某些位置,来实时监视九条可怜,从而保护她。具体来......