NULL_PTR
65 posts


Great to see this. It really helps to have regular, dedicated periods of time where the only work done is fixing bugs and paper cuts.
A week is a good amount of time! If I were making user-facing software I'd probably go as far as a week every month, OKRs be damned.
Zed@zeddotdev
This week is quality week! 🎉 During this period, you'll see fewer PRs adding new features, as we will be dedicating this time to squashing 🐛🐜 and addressing papercuts throughout Zed. Happy coding!

If you look at the tokenizer in the just-released MSFT Phi-2 model and you're used to Llama-like models, something may stand out: wtf is "Ġ"?
Well...
- The tokenizer is using SentencePiece as is common these days;
- SentencePiece uses regular whitespace as a segment delimiter, including in its file formats (e.g. token pairs for token merging are delimited by a space). So it needs to "escape" the whitespace.
- And indeed, by default SentencePiece escapes whitespace with "▁" (U+2581) which AFAICT semantically doesn't have much to do with a space, but looks sufficiently like it, so fine.
But... "Ġ" is emphatically not "▁". What gives?
If you look at the Unicode symbol for "Ġ", which is U+0120, and you know your ASCII table, this is suspicious because it's as if someone said "escape spaces by adding 256 to them".
And indeed, this seems to date back to GPT-2 tokenization code which represented any byte outside of a small set of known "valid" Unicode codepoints as 256+n... and space (32 = 0x20) wasn't part of the set (lines 8-28): github.com/openai/gpt-2/b…
I don't really have a point here but I'm actively questioning my life choices right now.
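For reference, the byte-to-character table described above can be sketched in C++. This is a re-creation of the idea behind GPT-2's Python helper, not the original code: the function name `bytes_to_unicode` mirrors the Python one, and everything else here is an assumption made for illustration. Bytes that already have a printable Latin-1 glyph keep their own code point; every other byte gets 256 plus a running counter. Space is byte 32 and the first 33 bytes are all unprintable, so the counter happens to equal the byte value there, which is why it looks like "add 256".

```cpp
#include <array>

// Sketch of a GPT-2 style byte -> display code point table.
// Printable Latin-1 bytes map to themselves; the rest are assigned
// 256, 257, 258, ... in increasing byte order.
inline std::array<char32_t, 256> bytes_to_unicode() {
    auto is_printable = [](int b) {
        return (b >= 0x21 && b <= 0x7E)    // '!' .. '~'
            || (b >= 0xA1 && b <= 0xAC)    // U+00A1 .. U+00AC
            || (b >= 0xAE && b <= 0xFF);   // U+00AE .. U+00FF
    };

    std::array<char32_t, 256> table{};
    char32_t next_unprintable = 0;  // the "n" in 256+n
    for (int b = 0; b < 256; ++b) {
        if (is_printable(b)) {
            table[b] = static_cast<char32_t>(b);
        } else {
            table[b] = static_cast<char32_t>(256) + next_unprintable;
            ++next_unprintable;
        }
    }
    return table;
}
```

With this table, `bytes_to_unicode()[0x20]` is U+0120 ("Ġ") and `bytes_to_unicode()[0x0A]` is U+010A ("Ċ"). Past the first 33 bytes the counter and the byte value drift apart: byte 0x7F maps to U+0121, not U+017F.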

NULL_PTR retweeted

dear imgui 1.77 release: github.com/ocornut/imgui/…
- general maintenance release with 25+ fixes and small additions in main branch.
- click for changelog with details individual branches (always a recommended read because you'll discover features)





@stefandd7 @wvo The for-each iterator is a reference to the target element (unlike, for example, C#, where iterating over a value-type array only gives you a copy)

Benchmark comparing Lobster with Lua and LuaJIT (it is faster than both): github.com/stefandd/Tic4

Somehow, one of the major annoyances of working with Visual Studio
Arseny Kapoulkine 🇺🇦@zeuxcg
I swear, every second time I open @VisualStudio property window, I hit an issue where the configuration isn't what's currently selected for build and spend minutes being confused as to why the changes didn't work. This worked fine in 2012 and changed circa 2015 :( Whyyyyyyyyy.

@aras_p Looking at some C++ proposals, there seems to be an over-reliance on "it's zero overhead if you compile with -O3". Like you've mentioned, the result is that the feature is now being avoided in production, because debug builds are important and huge overhead makes those unusable

My previous tweet on C++20 ranges example kinda exploded, so I thought it would be good to write down wot I think in a blog form. "Modern C++ Lamentations" aras-p.info/blog/2018/12/2…
It's kinda random and not structured well, but hey I'm also trying to have a vacation here :)
NULL_PTR retweeted

@__Necrys__ For me it's quite the opposite: an album comes out and it's "Guys, you had such a good style, where did you get carried off to?"
NULL_PTR retweeted

Narita Boy looks amazing [please don't fail] -The retro futuristic pixel game on @Kickstarter kck.st/2lqJBoD

@Lattelecom what's up with your connection speedtest.net/my-result/6058…

@grumpygiant if you want strong typedefs, you can use 'enum Handle: int{};' from C++11, as long as you don't require all the arithmetic ops
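A minimal sketch of that enum trick, assuming two made-up handle types (`TextureHandle`, `BufferHandle`) and two hypothetical helpers (`make_texture_handle`, `raw_value`) that are not from any real API. An enum with a fixed underlying type and no enumerators can hold any `int` value, but the compiler will not implicitly convert an `int` or another handle type into it. The caveat about arithmetic shows up in the last assertion: the handle promotes to `int`, so the result of `h + 1` is no longer a handle.

```cpp
#include <type_traits>

// Two distinct "strong typedefs" over int (hypothetical names).
enum TextureHandle : int {};
enum BufferHandle : int {};

// Distinct types, so overloads and parameters cannot be mixed up.
static_assert(!std::is_same<TextureHandle, BufferHandle>::value,
              "handles are different types");
// No implicit conversion from a raw int or from another handle type.
static_assert(!std::is_convertible<int, TextureHandle>::value,
              "int does not silently become a handle");
static_assert(!std::is_convertible<BufferHandle, TextureHandle>::value,
              "one handle does not silently become another");
// Unscoped enums still decay to int, and arithmetic yields int, not a handle.
static_assert(std::is_convertible<TextureHandle, int>::value,
              "handle -> int is implicit");
static_assert(std::is_same<decltype(TextureHandle{} + 1), int>::value,
              "arithmetic leaves the handle type");

// Explicit wrapping and unwrapping (hypothetical helpers).
inline TextureHandle make_texture_handle(int raw) {
    return static_cast<TextureHandle>(raw);
}
inline int raw_value(TextureHandle h) {
    return static_cast<int>(h);
}
```

If the implicit handle-to-int conversion is unwanted too, `enum class` closes that direction as well, at the cost of a cast for every use of the raw value.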

@grumpygiant your first point of 'Calculating the size of a static array' is covered by std::size open-std.org/jtc1/sc22/wg21… in C++17
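A small sketch of the difference, assuming a made-up table `kPrimes` and needing a C++17 compiler. `std::size` from `<iterator>` returns the element count of a built-in array as a constant expression. Unlike the `sizeof` division idiom, it fails to compile when handed a pointer instead of silently giving a wrong answer.

```cpp
#include <cstddef>
#include <iterator>  // std::size (C++17)

// Hypothetical lookup table, used only for illustration.
constexpr int kPrimes[] = {2, 3, 5, 7, 11};

// Pre-C++17 idiom: correct for arrays, silently wrong for pointers.
constexpr std::size_t kCountOld = sizeof(kPrimes) / sizeof(kPrimes[0]);

// C++17: deduces the array bound from the type; rejects pointers.
constexpr std::size_t kCountNew = std::size(kPrimes);

static_assert(kCountOld == kCountNew, "both idioms agree on real arrays");
static_assert(kCountNew == 5, "five elements in the table");
```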
NULL_PTR retweeted

@CatalystMaker Any chance that this new Crimson software won't put itself in every explorer context menu?

