
std::fstream buffering vs manual buffering (why a 10x gain with manual buffering)?

nicepro 2021. 1. 8. 22:51



I tested two writing configurations:

1) fstream buffering:

// Initialization
const unsigned int length = 8192;
char buffer[length];
std::ofstream stream;
stream.rdbuf()->pubsetbuf(buffer, length);
stream.open("test.dat", std::ios::binary | std::ios::trunc);

// To write I use :
stream.write(reinterpret_cast<char*>(&x), sizeof(x));

2) Manual buffering:

// Initialization
const unsigned int length = 8192;
char buffer[length];
std::ofstream stream("test.dat", std::ios::binary | std::ios::trunc);

// Then I put manually the data in the buffer

// To write I use :
stream.write(buffer, length);

I expected the same results...

However, manual buffering gives a 10x performance improvement for writing a 100 MB file, while fstream buffering changes nothing compared to the normal situation (without redefining the buffer).

Does someone have an explanation for this situation?

EDIT: Here is the news: the same behavior shows up in a benchmark just run on a supercomputer (Linux 64-bit architecture, Intel Xeon 8-core, Lustre file system and ... a hopefully well-configured compiler), and I cannot explain the reason for the "resonance" at a 1 kB manual buffer...

EDIT 2: And the resonance at 1024 B (if somebody has an idea about it, I am interested): (benchmark chart not reproduced here)


This is basically due to function call overhead and indirection. The ofstream::write() method is inherited from ostream. That function is not inlined in libstdc++, which is the first source of overhead. Then ostream::write() has to call rdbuf()->sputn() to do the actual writing, which is a virtual function call.

On top of that, libstdc++ redirects sputn() to another virtual function xsputn() which adds another virtual function call.

If you put the characters into the buffer yourself, you can avoid that overhead.
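For illustration, here is a minimal, self-contained sketch of the two scenarios from the question (the file names, the 8 KiB chunk size and the timing harness are my own choices, not the original benchmark). It writes the same 100 MB once value-by-value and once from a manually filled buffer, which makes the cost of taking the ostream::write()/sputn()/xsputn() path for every 8-byte value visible:

#include <chrono>
#include <cstring>
#include <fstream>
#include <iostream>

int main() {
    const std::size_t total = 100 * 1024 * 1024;  // 100 MB, as in the question
    const std::size_t chunk = 8192;               // same size as the question's buffer

    // 1) One ofstream::write() per value: every call goes through
    //    ostream::write() -> rdbuf()->sputn() -> xsputn(), i.e. several
    //    non-inlined / virtual calls per 8 bytes written.
    {
        std::ofstream out("per_value.dat", std::ios::binary | std::ios::trunc);
        const double x = 3.14;
        auto t0 = std::chrono::steady_clock::now();
        for (std::size_t written = 0; written < total; written += sizeof(x))
            out.write(reinterpret_cast<const char*>(&x), sizeof(x));
        out.flush();
        auto t1 = std::chrono::steady_clock::now();
        std::cout << "per-value writes: "
                  << std::chrono::duration<double>(t1 - t0).count() << " s\n";
    }

    // 2) Manual buffering: fill a local array, then one ofstream::write()
    //    per 8 KiB, so the call overhead is paid roughly 1000x less often.
    {
        std::ofstream out("manual.dat", std::ios::binary | std::ios::trunc);
        char buffer[chunk];
        const double x = 3.14;
        auto t0 = std::chrono::steady_clock::now();
        for (std::size_t written = 0; written < total; written += chunk) {
            for (std::size_t off = 0; off + sizeof(x) <= chunk; off += sizeof(x))
                std::memcpy(buffer + off, &x, sizeof(x));
            out.write(buffer, chunk);
        }
        out.flush();
        auto t1 = std::chrono::steady_clock::now();
        std::cout << "manual buffering: "
                  << std::chrono::duration<double>(t1 - t0).count() << " s\n";
    }
}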


I would like to explain the cause of the peak in the second chart.

In fact, the virtual functions used by std::ofstream lead to the performance decrease we see in the first chart, but that does not answer why the highest performance occurred when the manual buffer size was less than 1024 bytes.

The problem relates to the high cost of the writev() and write() system calls and to the internal implementation of std::filebuf, the internal class of std::ofstream.

To show how write() influences performance, I ran a simple test with the dd tool on my Linux machine, copying a 10 MB file with different buffer sizes (the bs option):

test@test$ time dd if=/dev/zero of=zero bs=256 count=40000
40000+0 records in
40000+0 records out
10240000 bytes (10 MB) copied, 2.36589 s, 4.3 MB/s

real    0m2.370s
user    0m0.000s
sys     0m0.952s

test@test$ time dd if=/dev/zero of=zero bs=512 count=20000
20000+0 records in
20000+0 records out
10240000 bytes (10 MB) copied, 1.31708 s, 7.8 MB/s

real    0m1.324s
user    0m0.000s
sys     0m0.476s

test@test$ time dd if=/dev/zero of=zero bs=1024 count=10000
10000+0 records in
10000+0 records out
10240000 bytes (10 MB) copied, 0.792634 s, 12.9 MB/s

real    0m0.798s
user    0m0.008s
sys     0m0.236s

test@test$ time dd if=/dev/zero of=zero bs=4096 count=2500
2500+0 records in
2500+0 records out
10240000 bytes (10 MB) copied, 0.274074 s, 37.4 MB/s

real    0m0.293s
user    0m0.000s
sys     0m0.064s

As you can see, the smaller the buffer, the lower the write speed and the more time dd spends in kernel (system) space. So, read/write speed decreases as the buffer size decreases.
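If you prefer to reproduce this from C++ rather than with dd, here is a minimal sketch using the POSIX open()/write() calls directly (the file name and sizes are arbitrary, and the numbers will of course differ per machine). It writes the same 10 MB with exactly one write() system call per chunk, so only the chunk size varies:

#include <chrono>
#include <cstdio>
#include <vector>
#include <fcntl.h>
#include <unistd.h>

// Write `total` zero bytes to `path`, one write() system call per `bs`-byte
// chunk, and report the elapsed time (a rough C++ analogue of the dd runs above).
static void run(const char* path, std::size_t total, std::size_t bs) {
    std::vector<char> buf(bs, 0);
    int fd = ::open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { std::perror("open"); return; }

    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t written = 0; written < total; written += bs)
        if (::write(fd, buf.data(), bs) < 0) { std::perror("write"); break; }
    ::close(fd);
    auto t1 = std::chrono::steady_clock::now();

    std::printf("bs=%zu: %.3f s\n", bs,
                std::chrono::duration<double>(t1 - t0).count());
}

int main() {
    const std::size_t total = 10 * 1024 * 1024;  // 10 MB, as in the dd test
    for (std::size_t bs : {256, 512, 1024, 4096})
        run("zero.dat", total, bs);
}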

But why was the highest speed observed when the manual buffer size was less than 1024 bytes in the question author's manual-buffering tests? And why was it almost constant in that range?

The explanation lies in the std::ofstream implementation, specifically in std::basic_filebuf.

By default it uses a 1024-byte buffer (the BUFSIZ constant). So, when you write your data in pieces smaller than 1024 bytes, the writev() (not write()) system call is issued roughly once for every two ofstream::write() operations (for example, with 1023-byte pieces: the first piece is copied into the buffer, and the second forces both to be written out). Based on this, we can conclude that before the peak the ofstream::write() speed hardly depends on the manual buffer size, because the system call is issued relatively rarely.

When you write a buffer of 1024 bytes or more at once with ofstream::write(), the writev() system call is issued for every ofstream::write(). So you see the speed increase once the manual buffer is larger than 1024 bytes (after the peak).

Moreover, if you set the std::ofstream buffer to more than 1024 bytes (for example, an 8192-byte buffer) using streambuf::pubsetbuf() and then call ostream::write() with 1024-byte pieces, you would be surprised to find that the write speed is the same as with the 1024-byte buffer. This is because the implementation of std::basic_filebuf - the internal class of std::ofstream - is hard-coded to issue a writev() system call for every ofstream::write() call whose buffer is greater than or equal to 1024 bytes (see the basic_filebuf::xsputn() source code). There is also an open issue in the GCC Bugzilla, reported on 2014-11-05.
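To see this effect yourself, here is a minimal sketch (file names and the timing harness are my own choices) that writes 100 MB in 1024-byte pieces, once through a default-buffered stream and once through a stream whose buffer was enlarged to 8192 bytes with pubsetbuf(). According to the explanation above, on libstdc++ both variants end up issuing a system call for each ofstream::write(), so they should time about the same:

#include <chrono>
#include <cstddef>
#include <fstream>
#include <iostream>

// Write `total` bytes in 1024-byte pieces through `stream` and return seconds.
static double run(std::ofstream& stream, std::size_t total) {
    char piece[1024] = {};
    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t written = 0; written < total; written += sizeof(piece))
        stream.write(piece, sizeof(piece));
    stream.flush();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

int main() {
    const std::size_t total = 100 * 1024 * 1024;  // 100 MB

    // Stream with the default internal filebuf buffer.
    std::ofstream a("default_buf.dat", std::ios::binary | std::ios::trunc);

    // Stream whose buffer is enlarged to 8192 bytes before open(), as in the question.
    char big[8192];
    std::ofstream b;
    b.rdbuf()->pubsetbuf(big, sizeof(big));
    b.open("big_buf.dat", std::ios::binary | std::ios::trunc);

    std::cout << "default buffer  : " << run(a, total) << " s\n"
              << "8192-byte buffer: " << run(b, total) << " s\n";
}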

So, this problem can be worked around in one of the following ways:

  • replace std::filebuf with your own class and redefine std::ofstream (a minimal sketch follows this list)
  • divide the buffer that is to be passed to ofstream::write() into pieces smaller than 1024 bytes and pass them to ofstream::write() one by one
  • don't pass small pieces of data to ofstream::write(), to avoid losing performance to the virtual functions of std::ofstream
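
As an illustration of the first workaround (replacing std::filebuf with your own class), here is a minimal sketch of a streambuf with its own large put area. The 64 KiB size, the FILE*/fwrite() backend and the class name big_buf are all arbitrary choices for the sketch, not a drop-in replacement for std::filebuf:

#include <cstdio>
#include <ostream>
#include <streambuf>

// A streambuf that collects output in a 64 KiB put area and flushes it with a
// single fwrite(), so small ostream::write() calls never reach the OS one by one.
class big_buf : public std::streambuf {
public:
    explicit big_buf(const char* path) : file_(std::fopen(path, "wb")) {
        setp(buffer_, buffer_ + sizeof(buffer_));  // whole array is the put area
    }
    ~big_buf() override { sync(); if (file_) std::fclose(file_); }

protected:
    // Called when the put area is full: flush it, then store the pending char.
    int_type overflow(int_type ch) override {
        if (sync() == -1) return traits_type::eof();
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            sputc(traits_type::to_char_type(ch));
        return traits_type::not_eof(ch);
    }
    // Flush the put area with one fwrite() and reset it.
    int sync() override {
        if (!file_) return -1;
        const std::size_t n = static_cast<std::size_t>(pptr() - pbase());
        if (n != 0 && std::fwrite(pbase(), 1, n, file_) != n) return -1;
        setp(buffer_, buffer_ + sizeof(buffer_));
        return 0;
    }

private:
    std::FILE* file_;
    char buffer_[64 * 1024];  // 64 KiB put area
};

int main() {
    big_buf buf("test.dat");
    std::ostream stream(&buf);  // used like an ofstream
    const double x = 3.14;
    for (int i = 0; i < 1000000; ++i)
        stream.write(reinterpret_cast<const char*>(&x), sizeof(x));
}   // destructors flush the remaining bytes and close the file

Since flushing only happens in sync()/overflow(), every small ostream::write() just appends to the put area and the OS sees one large write per 64 KiB, regardless of how small the individual pieces are.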

I would like to add to the existing answers that this performance behavior (all the overhead from virtual method calls and indirection) is typically not a problem when writing large blocks of data. What seems to have been omitted from the question and from these previous answers (though it was probably understood implicitly) is that the original code wrote only a small number of bytes at a time. To make it clear for others: if you write large data blocks (~kB or more), there is no reason to expect a significant performance difference between manual buffering and using std::fstream's own buffering.

Reference URL: https://stackoverflow.com/questions/12997131/stdfstream-buffering-vs-manual-buffering-why-10x-gain-with-manual-buffering
