alexinka (@_alexinka) - Twitter Profile | Zamantika

alexinka @_alexinka · Aug 4

@fchollet But what about OpenCL? No future?

The big breakthrough for convnets was the first GPU-accelerated CUDA implementation, which immediately started winning first place in image classification competitions. Remember when that happened? I do. That was Dan Ciresan in 2011.

Jürgen Schmidhuber @SchmidhuberAI

Who invented convolutional neural networks (CNNs)?

1969: Fukushima had CNN-relevant ReLUs [2].
1979: Fukushima had the basic CNN architecture with convolution layers and downsampling layers [1]. Compute was 100 x more costly than in 1989, and a billion x more costly than today.
1987: Waibel applied Linnainmaa's 1970 backpropagation [3] to weight-sharing TDNNs with 1-dimensional convolutions [4].
1988: Wei Zhang et al. applied "modern" backprop-trained 2-dimensional CNNs to character recognition [5].
All of the above was published in Japan 1979-1988.
1989: LeCun et al. applied CNNs again to character recognition (zip codes) [6,10].
1990-93: Fukushima’s downsampling based on spatial averaging [1] was replaced by max-pooling for 1-D TDNNs (Yamaguchi et al.) [7] and 2-D CNNs (Weng et al.) [8].
2011: Much later, my team with Dan Ciresan made max-pooling CNNs really fast on NVIDIA GPUs. In 2011, DanNet achieved the first superhuman pattern recognition result [9]. For a while, it enjoyed a monopoly: from May 2011 to Sept 2012, DanNet won every image recognition challenge it entered, 4 of them in a row.

Admittedly, however, this was mostly about engineering & scaling up the basic insights from the previous millennium, profiting from much faster hardware. Some "AI experts" claim that "making CNNs work" (e.g., [5,6,9]) was as important as inventing them. But "making them work" largely depended on whether your lab was rich enough to buy the latest computers required to scale up the original work. It's the same as today. Basic research vs engineering/development - the R vs the D in R&D.

REFERENCES
[1] K. Fukushima (1979). Neural network model for a mechanism of pattern recognition unaffected by shift in position - Neocognitron. Trans. IECE, vol. J62-A, no. 10, pp. 658-665, 1979.
[2] K. Fukushima (1969). Visual feature extraction by a multilayered network of analog threshold elements. IEEE Transactions on Systems Science and Cybernetics. 5 (4): 322-333. This work introduced rectified linear units (ReLUs), now used in many CNNs.
[3] S. Linnainmaa (1970). Master's Thesis, Univ. Helsinki, 1970. The first publication on "modern" backpropagation, also known as the reverse mode of automatic differentiation. (See Schmidhuber's well-known backpropagation overview: "Who Invented Backpropagation?")
[4] A. Waibel. Phoneme Recognition Using Time-Delay Neural Networks. Meeting of IEICE, Tokyo, Japan, 1987. Backpropagation for a weight-sharing TDNN with 1-dimensional convolutions.
[5] W. Zhang, J. Tanida, K. Itoh, Y. Ichioka. Shift-invariant pattern recognition neural network and its optical architecture. Proc. Annual Conference of the Japan Society of Applied Physics, 1988. First backpropagation-trained 2-dimensional CNN, with applications to English character recognition.
[6] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel: Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, 1(4):541-551, 1989. See also Sec. 3 of [10].
[7] K. Yamaguchi, K. Sakamoto, A. Kenji, T. Akabane, Y. Fujimoto. A Neural Network for Speaker-Independent Isolated Word Recognition. First International Conference on Spoken Language Processing (ICSLP 90), Kobe, Japan, Nov 1990. A 1-dimensional convolutional TDNN using Max-Pooling instead of Fukushima's Spatial Averaging [1].
[8] Weng, J., Ahuja, N., and Huang, T. S. (1993). Learning recognition and segmentation of 3-D objects from 2-D images. Proc. 4th Intl. Conf. Computer Vision, Berlin, pp. 121-128. A 2-dimensional CNN whose downsampling layers use Max-Pooling (which has become very popular) instead of Fukushima's Spatial Averaging [1].
[9] In 2011, the fast and deep GPU-based CNN called DanNet (7+ layers) achieved the first superhuman performance in a computer vision contest. See overview: "2011: DanNet triggers deep CNN revolution."
[10] How 3 Turing awardees republished key methods and ideas whose creators they failed to credit. Technical Report IDSIA-23-23, Swiss AI Lab IDSIA, 14 Dec 2023. See also the YouTube video for the Bower Award Ceremony 2021: J. Schmidhuber lauds Kunihiko Fukushima.

alexinka @_alexinka · Jul 31

@fchollet What about support for OpenCL, Intel oneAPI, AMD ROCm?

François Chollet @fchollet · Jul 30

The new Keras release (3.11.0) is out! Main upgrades:
• int4 quantization with all backends
• Support for Grain, a data i/o and streaming library inspired by tf-data, that is backend-agnostic
• On the JAX side, integration with the NNX library -- if you're a NNX user, you can start using any Keras layer/model (including models from KerasHub) as a NNX module
Release notes: github.com/keras-team/ker…
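
For readers unfamiliar with the "int4 quantization" item in the release above, here is a rough sketch of the general idea in plain Python. It is not the Keras implementation and uses no Keras API. The function names (`quantize_int4`, `pack_int4`, `unpack_int4`, `dequantize`) are hypothetical, and the symmetric per-tensor scheme with low-nibble-first packing is an assumption chosen for simplicity.

```python
def quantize_int4(weights):
    """Symmetric per-tensor quantization to the signed 4-bit range [-8, 7].

    Assumption: one scale for the whole tensor, chosen so that the largest
    magnitude maps to 7. Real libraries may use per-channel or grouped scales.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def pack_int4(q):
    """Pack pairs of signed 4-bit values into bytes, low nibble first."""
    if len(q) % 2:
        q = q + [0]  # pad to an even count so every byte holds two values
    return bytes(((hi & 0xF) << 4) | (lo & 0xF) for lo, hi in zip(q[0::2], q[1::2]))


def unpack_int4(packed, count):
    """Recover `count` signed 4-bit values from packed bytes."""
    out = []
    for b in packed:
        for nibble in (b & 0xF, b >> 4):
            # Nibbles 8..15 encode the negative values -8..-1 (two's complement).
            out.append(nibble - 16 if nibble >= 8 else nibble)
    return out[:count]


def dequantize(q, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [v * scale for v in q]


if __name__ == "__main__":
    w = [0.62, -0.31, 0.05, -0.9, 0.0]
    q, scale = quantize_int4(w)
    packed = pack_int4(q)
    print(len(w), "weights stored in", len(packed), "bytes")
    print(dequantize(unpack_int4(packed, len(q)), scale))
```

Two 4-bit codes share one byte, so storage is roughly an eighth of float32, at the cost of a rounding error of at most half a scale step per weight in this scheme.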

alexinka @_alexinka · Jul 3

@fchollet Are there enough energy sources for such conclusions?

François Chollet @fchollet · Jul 3

We are now closer to the year 2100 than to 1950. Also closer to 2050 than to 2000. Time to start acting like it.

alexinka @_alexinka · Jun 30

@0xDeepRed @arcprize @fchollet @dwarkesh_sp Even from people

໊ @0xDeepRed · Jun 30

@arcprize @fchollet @dwarkesh_sp nobody completed the first one yet 😂

ARC Prize @arcprize · Jun 30

ARC-AGI-3 Developer Preview
* Hands on first look at ARC-AGI-3 (live demos & API access)
* Fireside with @fchollet moderated by @dwarkesh_sp
7/17, San Francisco
Open to sponsors & researchers of @arcprize (very limited public slots available)

alexinka @_alexinka · Jun 30

@arcprize @fchollet @dwarkesh_sp Great! Waiting for results

alexinka @_alexinka · May 2

@fchollet how do you define program dynamics?

François Chollet @fchollet · Apr 11

Generating *code* using a statistical prior about which token sequences are more likely is a fundamentally different task than generating *programs* by searching directly in the space of program dynamics

alexinka @_alexinka · Feb 6

Sliced 231 fruits in Classic mode in @FruitNinja for iPhone! bit.ly/s28E3O

alexinka @_alexinka · Dec 19

I just got up to 284 in #DoodleJump Christmas Special Free!!! Beat that! bit.ly/DJCSFiOSft

alexinka @_alexinka · Dec 19

I just got up to 1,604 in #DoodleJump Christmas Special Free!!! Beat that! bit.ly/DJCSFiOSft
