Shallow copying
Because C++ does not know much about your class, the default copy constructor and default assignment operators it provides use a copying method known as a memberwise copy (also known as a shallow copy). This means that C++ copies each member of the class individually (using the assignment operator for overloaded operator=, and direct initialization for the copy constructor). When classes are simple (e.g. do not contain any dynamically allocated memory), this works very well.
For example, let’s take a look at our Fraction class:
#include <cassert>
#include <iostream>
class Fraction {
private:
// Plain int members; no dynamic allocation is involved
int m_numerator { 0 };
int m_denominator { 1 };
public:
// Default constructor
Fraction(int numerator = 0, int denominator = 1)
: m_numerator{ numerator }
, m_denominator{ denominator } {
assert(denominator != 0);
}
// Friend declaration for the overloaded operator<<
friend std::ostream& operator<<(std::ostream& out, const Fraction& f1);
};
// Overload operator<< to print the contents of a Fraction
std::ostream& operator<<(std::ostream& out, const Fraction& f1) {
out << f1.m_numerator << '/' << f1.m_denominator;
return out;
}
The default copy constructor and default assignment operator provided by the compiler for this class look something like this:
#include <cassert>
#include <iostream>
class Fraction {
private:
int m_numerator { 0 };
int m_denominator { 1 };
public:
// Default constructor
Fraction(int numerator = 0, int denominator = 1)
: m_numerator{ numerator }
, m_denominator{ denominator } {
assert(denominator != 0);
}
// Possible implementation of implicit copy constructor
Fraction(const Fraction& f)
: m_numerator{ f.m_numerator }
, m_denominator{ f.m_denominator } {
}
// Possible implementation of implicit assignment operator
Fraction& operator= (const Fraction& fraction) {
// self-assignment guard
if (this == &fraction)
return *this;
// do the copy
m_numerator = fraction.m_numerator;
m_denominator = fraction.m_denominator;
// return the existing object so we can chain this operator
return *this;
}
friend std::ostream& operator<<(std::ostream& out, const Fraction& f1) {
out << f1.m_numerator << '/' << f1.m_denominator;
return out;
}
};
Note that because these default versions work just fine for copying this class, there’s really no reason to write our own versions of these functions in this case. The compiler-generated copy constructor and copy assignment operator do exactly what the hand-written versions above do.
However, when designing classes that handle dynamically allocated memory, memberwise (shallow) copying can get us into a lot of trouble! A shallow copy of a pointer copies only the address the pointer holds; it does not allocate any memory or copy the contents being pointed to. The problem arises when a class's private member is just a pointer and the constructor allocates the memory that pointer refers to.
Let’s take a look at an example of this:
#include <cstring> // for strlen()
#include <cassert> // for assert()
class MyString
{
private:
char* m_data{};
int m_length{};
public:
MyString(const char* source = "" )
{
assert(source); // make sure source isn't a null pointer
// Find the length of the string
// Plus one character for a terminator
m_length = static_cast<int>(std::strlen(source)) + 1; // strlen() returns std::size_t
// Allocate a buffer equal to this length
m_data = new char[m_length];
// Copy the parameter string into our internal buffer
for (int i{ 0 }; i < m_length; ++i)
m_data[i] = source[i];
}
~MyString() // destructor
{
// We need to deallocate our string
delete[] m_data;
}
char* getString() { return m_data; }
int getLength() { return m_length; }
};
The above is a simple string class that allocates memory to hold a string that we pass in. Note that we have not defined a copy constructor or overloaded assignment operator. Consequently, C++ will provide a default copy constructor and default assignment operator that do a shallow copy. The copy constructor will look something like this:
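// A shallow copy copies only the pointer, not the contents of the dynamically allocated memory
// (members are listed in declaration order: m_data, then m_length)
MyString::MyString(const MyString& source)
: m_data { source.m_data }
, m_length { source.m_length }
{
}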
Note that m_data is just a shallow pointer copy of source.m_data, meaning they now both point to the same thing.
Now, consider the following snippet of code:
#include <iostream>
int main() {
// Construct an object using the const char* constructor
MyString hello{ "Hello, world!" };
{
MyString copy{ hello }; // use the default (shallow) copy constructor
}
// copy is a local variable, so it is destroyed at the closing brace above.
// Its destructor deletes the buffer copy.m_data points to, which is the same heap
// buffer hello.m_data points to (hello's own copy of "Hello, world!", not the literal),
// leaving hello with a dangling pointer.
std::cout << hello.getString() << '\n'; // undefined behavior: reads freed memory
return 0;
}
When copy goes out of scope, the MyString destructor is called on copy. The destructor deletes the dynamically allocated memory that both copy.m_data and hello.m_data point to! Consequently, by destroying copy, we’ve also (inadvertently) affected hello: hello.m_data is left pointing to deleted (invalid) memory. Reading through that pointer is undefined behavior, and when hello is itself destroyed at the end of main(), its destructor deletes the same memory a second time.
The root of this problem is the shallow copy done by the copy constructor -- doing a shallow copy on pointer values in a copy constructor or overloaded assignment operator is almost always asking for trouble.
Deep copying
One answer to this problem is to do a deep copy on any non-null pointers being copied. A deep copy allocates memory for the copy and then copies the actual value, so that the copy lives in distinct memory from the source. This way, the copy and source are distinct and will not affect each other in any way. Doing deep copies requires that we write our own copy constructors and overloaded assignment operators.
Let’s go ahead and show how this is done for our MyString class:
// assumes m_data is initialized
void MyString::deepCopy(const MyString& source) {
// first we need to deallocate any value that this string is holding;
// otherwise every call to the copy assignment operator would leak the old buffer
delete[] m_data;
// because m_length is not a pointer, we can shallow copy it
m_length = source.m_length;
// m_data is a pointer, so we need to deep copy it if it is non-null
if (source.m_data) {
// allocate memory for our copy
m_data = new char[m_length];
// do the copy
for (int i{ 0 }; i < m_length; ++i)
m_data[i] = source.m_data[i];
}
else
m_data = nullptr;
}
// Copy constructor: the compiler will not generate a deep copy, so we write it by hand
MyString::MyString(const MyString& source)
{
deepCopy(source);
}
As you can see, this is quite a bit more involved than a simple shallow copy! First, we have to check to make sure source even has a string. If it does, then we allocate enough memory to hold a copy of that string. Finally, we have to manually copy the string.
Now let’s do the overloaded assignment operator. The overloaded assignment operator is slightly trickier:
// Assignment operator
MyString& MyString::operator=(const MyString& source) {
// check for self-assignment
if (this != &source)
{
// now do the deep copy
deepCopy(source);
}
return *this;
}
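The copy assignment function here is simply the overloaded operator= of the MyString class; once the operator is overloaded, it only needs to perform one deep copy.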
Note that our assignment operator is very similar to our copy constructor, but there are three major differences:
- We added a self-assignment check.
- We return *this so we can chain the assignment operator.
- We need to explicitly deallocate any value that the string is already holding (so we don’t have a memory leak when m_data is reallocated later). This is handled inside deepCopy().
When the overloaded assignment operator is called, the item being assigned to may already contain a previous value, which we need to make sure we clean up before we assign memory for new values. For non-dynamically allocated variables (which are a fixed size), we don’t have to bother because the new value just overwrites the old one. However, for dynamically allocated variables, we need to explicitly deallocate any old memory before we allocate any new memory. If we don’t, the code will not crash, but we will have a memory leak that will eat away our free memory every time we do an assignment!
Summary
- The default copy constructor and default assignment operators do shallow copies, which is fine for classes that contain no dynamically allocated variables.
- Classes with dynamically allocated variables need to have a copy constructor and assignment operator that do a deep copy.
- Favor using classes in the standard library over doing your own memory management.